Applied ML / Data Engineer
RounDC is a service that automatically matches startups with investors:
we take data from Telegram/WhatsApp, enrich it using LLMs, and build smart matching between startups and funds/angels on top of our own CRM.
We are looking for an Applied ML / Data Engineer to own the development of the matching engine core and the data pipeline.
Tasks:
- Design and develop a startup and investor matching service based on embeddings, rules, and tools like Splink.
- Configure entity resolution / deduplication for people and company databases in the CRM.
- Integrate ML logic with the current stack (Python, LLM API, CRM, and, later, Elasticsearch).
- Introduce quality metrics (precision/recall, hit rate, etc.) and improve matching quality based on user feedback.
Requirements:
- Strong Python skills, preferably with experience writing production code for data/ML tasks.
- Practical experience with text embeddings and vector search.
- Experience in entity matching / deduplication (fuzzy matching, record linkage; Splink or similar is a plus).
- Solid SQL skills and experience working with application databases (CRM, OLTP schemas).
- An understanding of how to bring ML solutions to production: API services, queues/streaming, logging, monitoring.
Nice to have:
- Experience with Elasticsearch or another search engine.
- Experience integrating with messaging platforms or building high-throughput data ingestion.
- Experience in recommendation systems or product data science.