Data Scientist
Hi! My name is Lilia Starostina, and I am a budding Data Scientist.
I hold a diploma in Electrical Engineering and have built a diverse career spanning teaching and scientific research at universities, as well as engineering roles in the mining industry. In recent years, I discovered my passion for Data Science, completed specialized training in this field, and decided to focus on data analysis as the next step in my professional journey. I am now open to collaboration on data science projects, particularly those involving Natural Language Processing (NLP) and Time Series Analysis, and actively seeking opportunities to apply my skills in this dynamic field!
SQL, Python, libraries: NumPy, Pandas, SciPy, Statsmodels, NLTK, Re, Pymorphy2, Matplotlib, Seaborn, Sklearn, Pymystem3, Gensim, PyTorch, Keras, Cv2, PIL, PySpark, Surprise, Requests, BeautifulSoup, Time.
• Proficient in Python for data analysis and machine learning;
• Skilled in SQL for data retrieval and manipulation;
• Experienced in big data technologies and frameworks (e.g., PySpark);
• Expertise in creating and training neural networks;
• Competent in working with text, image, and time series data;
• Strong foundation in classical machine learning models and feature engineering.
• Selecting and implementing classical machine learning algorithms tailored to specific tasks;
• Performing feature selection and feature engineering for machine learning models;
• Developing recommendation systems using collaborative filtering and content-based methods;
• Solving Computer Vision tasks, such as image classification, segmentation, and object detection;
• Training language models and working with attention mechanisms.
• Data Scientist (2025);
• Engineer's degree in Electric Drive and Automation (2006).
Computer Vision
• Segmentation of objects in an image
• Image Classification of the “Cats vs Dogs” Dataset
• Improving the quality of neural network training for image classification
• Handwriting recognition using the MNIST database
Natural Language Processing
• Topic modeling and sentiment classification of reviews based on classical ML algorithms
• Translation of phrases using the attention mechanism
SQL
• SQL queries for air transportation data analysis (PostgreSQL)
Time series analysis
• Analysis of stationary and non-stationary time series
Big Data with PySpark
• Logistic Regression Model for Iris Flower Classification Using PySpark
Other projects:
• Management of data science projects
• l.v.starostina2014@gmail.com
• Resume