Lilia Starostina

Data Scientist

View My GitHub Profile

About me

Hi! My name is Lilia Starostina, and I am a budding Data Scientist.

I hold a diploma in Electrical Engineering and have built a diverse career spanning teaching and scientific research at universities, as well as engineering roles in the mining industry. In recent years, I discovered my passion for Data Science, completed specialized training in this field, and decided to focus on data analysis as the next step in my professional journey. I am now open to collaboration on data science projects, particularly those involving Natural Language Processing (NLP) and Time Series Analysis, and actively seeking opportunities to apply my skills in this dynamic field!

Stack

SQL, Python; libraries: NumPy, Pandas, SciPy, Statsmodels, NLTK, Re, Pymorphy2, Matplotlib, Seaborn, Sklearn, Pymystem3, Gensim, PyTorch, Keras, Cv2, PIL, PySpark, Surprise, Requests, BeautifulSoup, Time.

Core Competencies

• Proficient in Python for data analysis and machine learning;

• Skilled in SQL for data retrieval and manipulation;

• Experienced in big data technologies and frameworks (e.g., PySpark);

• Expertise in creating and training neural networks;

• Competent in working with text, image, and time series data;

• Strong foundation in classical machine learning models and feature engineering.

I possess the following skills:

• Selecting and implementing classical machine learning algorithms tailored to specific tasks;

• Performing feature selection and feature engineering for machine learning models;

• Developing recommendation systems using collaborative filtering and content-based methods;

• Solving Computer Vision tasks, such as image classification, segmentation, and object detection;

• Training language models and working with attention mechanisms.

Education

• Data Scientist (2025);

• Engineer's degree in Electric Drive and Automation (2006).

Certificates

Data science projects

Computer Vision

Segmentation of objects in an image

Blood cell image detection

Image Classification of the “Cats vs Dogs” Dataset

Improving neural network training quality for image classification

Handwriting recognition using the MNIST database

Natural Language Processing

Topic modeling and sentiment classification of reviews based on classical ML algorithms

Translation of phrases using the attention mechanism

SQL

SQL queries for air transportation data analysis (PostgreSQL)

Time series analysis

Analysis of stationary and non-stationary time series

Big Data with PySpark

Logistic Regression Model for Iris Flower Classification Using PySpark

Other projects

Management of data science projects

Recommender systems

Statistics with Python

Web scraping

Regular expressions

Classical machine learning

My contacts

• l.v.starostina2014@gmail.com

Resume