* Team8 Portfolio Company
Harmonya helps retailers and manufacturers overcome the limitations caused by legacy data structures and unlock the true value of their data.
Harmonya is an AI-powered product data classification and enrichment platform for retailers and manufacturers. Leveraging proprietary ML and AI models, Harmonya synthesizes data from trillions of alternative data points to generate a holistic and dynamic view of products sold across the country.
In this role, you will:
- Be part of a data science team where you will lead end-to-end ML projects from data extraction through model validation to model deployment.
- Develop various ML models: primarily NLP models, such as named entity recognition, keyword extraction, and text classification, to create the ultimate taxonomy in the world of fast-moving products; graph theory models to identify relationships between products; and trend analysis models to surface the most interesting consumer-related trends.
- Discover and translate business challenges into data pipelines and models.
- Manage and design customer-specific analytical solutions using Machine Learning.
- Work closely with management and data engineering teams to integrate data collection, data quality, and core model output into production systems.
- Convey complex analysis results clearly and with conviction to stakeholders at all levels.
Requirements:
- 3+ years of experience with data science projects from conception to production.
- Experience with NLP practices such as part-of-speech tagging, named entity recognition, and word embeddings.
- Broad knowledge of data science modeling techniques (from RandomForests to RNNs and anything in between) and their use within the industry.
- Fluency in Python and hands-on experience with data science and NLP packages (spaCy, NLTK, gensim, TensorFlow, PyTorch, scikit-learn).
- Experience with large-scale model training and evaluation in an enterprise environment.
- Big advantage: experience with cloud machine learning platforms (Google ML Engine, AWS SageMaker) and/or commercial tools (DataRobot, BigML, or others).