What ML tools do you use in your daily routine?
In a nutshell:
- The standard PyData stack: NumPy, pandas, SciPy, etc.
- scikit-learn, XGBoost, and Keras (TensorFlow)
- SageMaker and MLflow
- Our internal ML platform at OLX Group
Other tools I use are not strictly ML-related: Kubernetes, Airflow, Spark, and many AWS services such as Batch, Lambda, and Kinesis. There are also many smaller tools that I don't use day-to-day but that are pretty helpful; Numba, for example, is a great one. I've probably forgotten a bunch of other super helpful things.

Do you use transformer-based metrics like BERT for MT tasks? Will LM-based metrics see more use in the future?
Nope, we don't. Even though OLX is available in many markets, we haven't had a use case for that, and I personally don't have any experience with it.

What do you think about feature-store tools like Tecton and Feast? When is it good practice to use them, and when not?
I haven't used Tecton, and setting up Feast seems like mission impossible. We usually go with simpler solutions based on DynamoDB, and that has worked well so far.

There is a lot of new ML research and many new techniques coming out all the time. How do you stay updated without getting overwhelmed? Or do you only look things up after stumbling on a particular problem?
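The "simpler solutions based on DynamoDB" mentioned above could be as small as one table keyed by entity ID, with one item holding that entity's features. The sketch below is a hypothetical illustration of that pattern, not OLX's actual setup; the table name, key name, and helper names are all assumptions. One DynamoDB-specific wrinkle it handles: DynamoDB rejects Python floats, so numeric feature values are converted to `Decimal` via a JSON round-trip.

```python
import json
from decimal import Decimal

TABLE_NAME = "user_features"  # hypothetical table, partition key "entity_id"


def to_dynamo_item(entity_id: str, features: dict) -> dict:
    """Serialize a feature dict into a DynamoDB-compatible item.

    DynamoDB does not accept Python floats, so the JSON round-trip with
    parse_float=Decimal converts all float values to Decimal.
    """
    safe = json.loads(json.dumps(features), parse_float=Decimal)
    return {"entity_id": entity_id, **safe}


def get_features(entity_id: str, table_name: str = TABLE_NAME):
    """Fetch the feature item for one entity (requires AWS credentials)."""
    # boto3 is imported lazily so the serialization helper above can be
    # used and tested without an AWS environment.
    import boto3

    table = boto3.resource("dynamodb").Table(table_name)
    resp = table.get_item(Key={"entity_id": entity_id})
    return resp.get("Item")  # None if the entity has no features yet
```

Compared to running a dedicated feature-store service, this keeps the moving parts to one managed table and a couple of helpers, at the cost of leaving train/serve consistency and point-in-time lookups up to you.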
First, let me say how I stay updated on tools. I mainly look at open-source tools; because there are so many, I can be selective. I also invite the authors to demo their tools, which is how I see what's out there. Shameless plug: the interviews go here
As for other things, I don't try to stay up-to-date. If enough people talk about something, both in communities/social media and at work, then I look at it.