mlsquare was started around 2018 as an open source initiative to develop ML techniques that help understand, develop, and improve ML models in a holistic sense.

The position paper argued that one has to go beyond model-centric and performance-centric ML and embrace holistic ML, which includes aspects such as interoperability, explainability, uncertainty quantification, and efficient training, among others. Given the scale, opacity, complexity, and cost of developing modern ML models (deep learning models in particular), one has to enlist the help of other ML models. For example, one ML model can explain the predictions of another ML model. ML for ML is a recurrent theme in the problems being tackled here (hence the name ML square).

As a concrete step, transpilation and model distillation techniques were developed to port models from one framework (say sklearn) to another (say PyTorch). This work was published at the 1st AI/ML Systems conference [paper, code].

Fast forward to the post-GPT world: building MVPs with LLMs like ChatGPT has become much easier, but building reliable and trustworthy LLM-based AI products is much harder. So multiple models working together (LLM for LLM) to address a complex set of challenges, such as evaluating an LLM, is becoming more prevalent and relevant in this post-GPT, multi-model, multi-agent paradigm. Further, the entire supply chain for manufacturing these LLMs requires a massive concentration of money, might (talent), and muscle (compute). As a result, building local, customized SLMs/LLMs in vernacular languages is extremely hard.

These challenges were addressed in FedEm from a pedagogic point of view, but a lot more needs to be done. The next attempts will therefore focus on creating an alternative build process for developing LLMs/SLMs. But the goal remains the same: democratize AI/ML for people, by people, with ML-for-ML.

Yours openly
The Saddle Point
