HOJJAT RAKHSHANI

Data Scientist


I'm a Data Scientist with a Ph.D. in Computer Science, currently involved in the digital transformation of store assortment optimization at Decathlon, one of the largest sporting goods retailers in the world, with 100,000+ teammates across 70 countries. In the past two years, I have developed real-world statistics and machine learning problem-solving skills to draw innovative solutions out of large and diverse data sets.

Previously, I conducted research in the fields of optimization and AutoML as part of my Ph.D. dissertation at the University of Upper Alsace. I have won several awards for optimization majors, given technical talks to data scientists at conferences and workshops, published at the highest international level, and built interactive web apps for data visualization.

I have over half a decade of experience in cutting-edge AutoML pipelines, ETL, forecast modeling, clustering, regression analysis, manifold learning, visualization, A/B testing, and optimization.


Work Experience

Data Scientist

Decathlon | May 2021 – Present
  • Applied LLMs for identity extraction from unstructured data using LangChain and prompt engineering. After transitioning to the research and development team, I now focus on advanced experiments, including training and developing custom models, specializing in trustworthy models deployed on Databricks, and conducting a PoC on Amazon Bedrock.
  • Developed assortment optimization solutions to maximize the expected revenue and minimize stock cost for physical stores, resulting in 80 million euros of total sales.
  • Delivered a 1-year forecasting model to predict turnover for each store and family using Amazon SageMaker DeepAR.
  • Presented an XGBoost regression analysis to infer the effect of including Covid-period data on store forecast models.
  • Supervised the needs analysis, defined the target stack, and supported the team in streamlining and moving our AI solutions to SageMaker, Databricks, and Airflow.

Research Scientist

UHA | July 2020 – April 2021
  • Proposed an AutoML pipeline that identifies links between similar scientific articles. This project led to a precise classifier reaching 90% accuracy on the final results.
  • Directed neural architecture search to find and train deep residual networks for time series data. Experiments on 85 instances show that the proposed model reaches new state-of-the-art accuracy compared to the HIVE-COTE model.
  • Examined a network interdiction multi-depot vehicle routing model in collaboration with the University of Kaiserslautern.

Ph.D. Research Assistant

UHA | May 2017 – June 2020
  • Proposed a novel optimization technique based on transfer and ensemble learning that reduces the required computational resources by storing knowledge gained while solving one optimization problem and applying it to a different but related one.
  • Applied metaheuristics to the Two-Stream Inflated 3D architecture model, pre-trained on the ImageNet and Kinetics source datasets, to optimize crowd movement prediction on the Crowd-11 target dataset.
  • Formulated a multi-objective framework for the automatic configuration of machine learning models.

TECHNICAL SKILLS

    Data Science: A/B testing, optimization, big data pipeline (cleansing, wrangling, visualization, modeling, interpretation), AutoML, statistics, time series, Scrum fundamentals, GitHub
    Programming Languages: Python (Pandas, scikit-learn, pytest, TensorFlow, PyTorch, SciPy, NLTK, Gensim), SQL, R, C++, Java
    Cloud Machine Learning: AWS (SageMaker, ECR, EMR, S3, Redshift), Spark, Databricks, Airflow

Projects

Multi-depot Vehicle Routing

In collaboration with the University of Kaiserslautern, we propose a network interdiction capacitated multi-depot vehicle routing problem. It aims to determine the worst-case consequences of a large-scale earthquake by finding the set of most critical network edges, that is, the edges whose collapse would have the maximum effect on the relief distribution system. My role is to fully code the developed solution algorithms.
Ongoing Research

Video Architecture Search

In this project, we study the application of image neural search methods to enhance the performance of supervised deep learning models for crowd movement classification. Compared with models designed for images, we improve the results from 47.4% to 60.6%. In addition, the number of model parameters is reduced from 24 million to 12 million.

Machine Learning for Information Retrieval

This project investigates to what extent a new machine learning pipeline can identify links between similar scientific articles. Automated machine learning is applied to ease the search for a new pipeline. We show that a newly designed model achieves an accuracy of 90%, compared to 82% for the best standard classifier.

Neural Search for Time Series

Neural search has achieved great success in computer vision tasks such as object detection and image recognition. This project aims to find and train deep residual networks for time series data. We conducted extensive experiments on 85 instances from the UCR archive. The experimental results show that our proposed model reaches new state-of-the-art accuracy: a single classifier beats HIVE-COTE, which is an ensemble of 37 individual classifiers.

Protein Structure Prediction

Protein structure prediction plays an important role in the field of computational molecular biology. Although powerful optimization algorithms have proven effective in tackling the protein problem, researchers face the challenge of time-consuming simulations. In this project, we introduce a new algorithm that uses machine learning models to address this issue. The introduced algorithm significantly outperforms competing algorithms for the adopted all-atom model on the Protein Data Bank.

PARTICIPATIONS

Publications

International Conferences

  • H. Rakhshani, H. Ismail-Fawaz, L. Idoumghar, G. Forestier, J. Weber, J. Lepagnot, M. Brévilliers, P. Muller, Optimizing deep residual neural networks for time series classification, The 2020 International Joint Conference on Neural Networks (IJCNN), 2020, Glasgow.
  • H. Rakhshani, B. Latard, M. Brévilliers, J. Weber, J. Lepagnot, G. Forestier, M. Hassenforder, L. Idoumghar, Automated machine learning for information retrieval in scientific articles, 2020 IEEE Congress on Evolutionary Computation (CEC), 2020, Glasgow.
  • H. Rakhshani, L. Idoumghar, J. Lepagnot, M. Brévilliers, MAC: Many-objective Automatic Algorithm Configuration, In: Deb K. et al. (eds) Evolutionary Multi-Criterion Optimization, Lecture Notes in Computer Science, Springer, 2019, vol. 11411.
  • H. Rakhshani, L. Idoumghar, J. Lepagnot, M. Brévilliers, From feature selection to continuous optimization, International Conference on Artificial Evolution (EA-2019), 2019, Mulhouse, France. LNCS Volume, pp. 1-8.
  • H. Rakhshani, L. Idoumghar, J. Lepagnot, M. Brévilliers and E. Keedwell, Automatic hyperparameter selection in Autodock, 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Madrid, Spain, 2018, pp. 734-738.
  • E. Keedwell, M. Brévilliers, L. Idoumghar, J. Lepagnot and H. Rakhshani, A Novel Population Initialization Method Based on Support Vector Machine, 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Miyazaki, Japan, 2018, pp. 751-756.
  • H. Rakhshani, L. Idoumghar, J. Lepagnot, M. Brévilliers and A. Rahati, Accelerating Protein Structure Prediction Using Active Learning and Surrogate-Based Optimization, 2018 IEEE Congress on Evolutionary Computation (CEC), Rio de Janeiro, 2018, pp. 1-6.
  • H. Rakhshani, L. Idoumghar, J. Lepagnot, M. Brévilliers, Application of the surrogate models for protein structure prediction, 7th International Conference on Metaheuristics and Nature Inspired Computing (META'18), Oct 2018, Marrakech, Morocco.

International Peer-reviewed Journals

  • S. Ghambari, H. Rakhshani, J. Lepagnot, L. Jourdan, L. Idoumghar, Unbalanced budget distribution for automatic algorithm configuration. Soft Computing, 2022, vol. 26, no 3, p. 1315-1330.
  • H. Rakhshani, L. Idoumghar, S. Ghambari, J. Lepagnot, M. Brévilliers, On the performance of deep learning for numerical optimization: an application to protein structure prediction. Applied Soft Computing, 110, 107596.
  • H. Rakhshani, L. Idoumghar, J. Lepagnot, M. Brévilliers, Speed up differential evolution for computationally expensive protein structure prediction problems, Swarm and Evolutionary Computation, 2019, vol 50, pp. 1-18.
  • H. Rakhshani, E. Dehghanian, A. Rahati, Enhanced GROMACS: toward a better numerical simulation framework, 2019, Journal of Molecular Modeling, vol 25, p. 355.
  • S. Etedali, H. Rakhshani, Optimum design of tuned mass dampers using multi-objective cuckoo search for buildings under seismic excitations, 2018, Alexandria Engineering Journal, vol 57, pp. 3205-3218.
  • H. Rakhshani, A. Rahati, Snap-drift cuckoo search: A novel cuckoo search optimization algorithm, 2017, Applied Soft Computing, vol 52, pp. 771-794.
  • W.W. Koczkodaj, J.P. Magnot, J. Mazurek, J.F. Peters, H. Rakhshani, M. Soltys, D. Strzałka, J. Szybowski, A. Tozzi, On normalization of inconsistency indicators in pairwise comparisons, 2017, International Journal of Approximate Reasoning, vol 86, pp. 73-79.
  • H. Rakhshani, A. Rahati, Intelligent multiple search strategy cuckoo algorithm for numerical and engineering optimization problems, Arabian Journal for Science and Engineering, 2017, vol 42.2, pp. 567-593.
  • H. Rakhshani, E. Dehghanian, A. Rahati, Hierarchy cuckoo search algorithm for parameter estimation in biological systems, Chemometrics and Intelligent Laboratory Systems, 2016, vol 159, pp. 97-107.