RounDC is a service that automatically matches startups with investors:
we pull data from Telegram/WhatsApp, enrich it using LLMs, and match startups with funds and angels on top of our own CRM.
We are looking for an Applied ML / Data Engineer to own the development of the matching core and the data pipeline.
Responsibilities:
- Design and develop a service for matching startups and investors based on embeddings, rules, and tools like Splink.
- Set up entity resolution / deduplication for the people and company records in the CRM.
- Integrate ML logic with the current stack (Python, LLM APIs, CRM, and, later on, Elasticsearch).
- Introduce quality metrics (precision/recall, hit rate, etc.) and improve matching quality based on user feedback.
Requirements:
- Strong Python skills; experience writing production code for data/ML tasks is preferred.
- Practical experience with text embeddings and vector search.
- Experience in entity matching / deduplication (fuzzy matching, record linkage; Splink or similar is a plus).
- Solid SQL skills and experience working with application databases (CRM, OLTP schemas).
- Understanding of how to bring ML solutions to production: API services, queues/streaming, logging, monitoring.
Nice to have:
- Experience with Elasticsearch or other search engines.
- Experience integrating with messaging platforms or building high-volume data ingestion.
- Experience in recommendation systems or product data science.