Python 101

For a data scientist, learning to code is crucial. There are quite a few programming languages out there, and knowing which one to start with can seem overwhelming for a beginner. I suggest you start with Python. Python is one of the most popular programming languages used in data analysis, machine learning, and AI. It was created by Guido van Rossum and first released in 1991. And no, it was not named after a snake. It was actually named after the British comedy series Monty Python’s Flying Circus. Since its release, it has undergone a few major revisions; Python 3, the most recent and possibly the final large revision, was released in 2008. At the time of writing, Python 3.11 is the most current version of Python 3.

Why Learn Python?


Alright, why should you learn Python, you ask? Well, let me break it down for you. While other languages like R and Julia have their perks, Python is like the holy grail (see what I did there?) of the data science world. R is great for complex statistical analysis, especially in academia, but it can be a bit harder to learn as a beginner. Julia is gaining popularity for its lightning-fast processing speed, but it’s still catching up in terms of available libraries. Python, on the other hand, is pretty beginner-friendly and has a massive library ecosystem. There are incredible packages for machine learning and AI modeling that are built specifically for Python. It’s basically the go-to language for industry data scientists, and my top choice for getting started in the field.

How To Get Started


Alright, now that you’re convinced that learning Python is the way to go, let’s talk about how you can get started on your Python journey.

  1. Install Python: The first thing you need to do is install Python on your computer. Python is available for all major operating systems, including Windows, macOS, and Linux. You can download the latest version of Python from the official Python website (https://www.python.org). Make sure to choose the appropriate version for your operating system.
  2. Choose an Integrated Development Environment (IDE): An IDE is a software application that provides tools and features to help you write, debug, and run your Python code. There are several popular options available for Python, such as PyCharm, Visual Studio Code, and Jupyter Notebook (strictly speaking a notebook environment rather than a full IDE, but it fills the same role when you are starting out). Choose the one that suits your needs and preferences. Personally, I recommend starting with Jupyter Notebook, as it provides an interactive and beginner-friendly environment.
  3. Learn the Basics: Once you have Python installed and an IDE set up, it’s time to start learning the basics of the language. There are numerous online resources available to learn Python, including tutorials, videos, and interactive coding platforms. I started with a free Python course from Harvard’s CS50. Then I did the free data science prep-work on the Flatiron School website. You can also find free tutorials and documentation on the official Python website.
  4. Practice, Practice, Practice: Learning to code is all about practice. As you learn new concepts and features of Python, make sure to apply them by writing your own code. You can follow free project tutorials on YouTube or do the homework exercises from whichever online course you take, such as CS50.
  5. Explore Python Libraries: One of the great advantages of Python is its extensive library ecosystem. There are libraries available for almost every task you can think of, from data manipulation and analysis to machine learning and visualization. Some popular Python libraries for data science include NumPy, Pandas, Matplotlib, and scikit-learn. Take the time to explore these libraries. Try adding what you’ve discovered to new projects.
  6. Join the Python Community: Python has a vibrant and supportive community of developers and data scientists. Try to participate in online forums, attend local meetups or conferences, and follow influential Python developers on social media. I attended a couple of large online conferences this year and was introduced to a lot of interesting programs. You’ll be amazed at how much you can learn and grow by being an active part of the community.
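
If you want something concrete to try once Python is installed, here is a minimal sketch of a first practice script. It uses only the standard library, so it should run before you have installed anything extra. It checks which Python version you are running, reports which of the data science libraries mentioned above can be found in your environment, and computes a few summary statistics. The function names, the (3, 8) minimum version, and the sample numbers are all my own assumptions for illustration, not anything official.

```python
import sys
import statistics
from importlib.util import find_spec


def python_version_ok(minimum=(3, 8)):
    """Return True if the running interpreter is at least `minimum`.

    The (3, 8) default is an arbitrary assumption for this example.
    """
    return sys.version_info[:2] >= minimum


def installed_libraries(names):
    """Map each import name to whether it can be found in this environment."""
    return {name: find_spec(name) is not None for name in names}


def summarize(values):
    """Return a few basic descriptive statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "max": max(values),
    }


if __name__ == "__main__":
    print("Python version:", sys.version.split()[0])
    print("Recent enough for this example:", python_version_ok())

    # Import names for the libraries listed in step 5.
    # Note that scikit-learn is imported as "sklearn".
    print(installed_libraries(["numpy", "pandas", "matplotlib", "sklearn"]))

    # Hypothetical sample data, made up purely for practice.
    daily_steps = [4200, 8100, 7600, 10300, 5900]
    print(summarize(daily_steps))
```

You can paste this into a Jupyter Notebook cell or save it as a .py file and run it. If any of the libraries show up as False, that is a good cue to install them and then try rewriting summarize with NumPy or Pandas as your next exercise.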

Wrapping Up


Learning Python is a great investment if you’re interested in data science, machine learning, or AI. Its beginner-friendly syntax, vast library ecosystem, and widespread industry adoption make it the go-to language for data scientists. Just remember to be patient. It takes time to learn a whole new language, but with practice, you’ll be coding in no time!
