Python is a popular programming language in the field of data analysis. This research analyzes data on the causes of lung cancer, chosen because lung cancer accounts for the highest number of deaths among cancer types. The purpose of the research is to explore how a dataset on lung cancer-causing factors can be processed using Python; the dataset was obtained from Kaggle. The research method is quantitative, with a population of 462,000 people and a sample of 999 people. The analysis technique used is exploratory data analysis. The results show that lung cancer was most prevalent among people aged 32-42 years and among those with a genetic risk level of 7, a dust allergy level of 7, and an obesity level of 7, and that men were at higher risk than women.