An artist with his canvas and paintbrush. A musician with his violin and bow. A tennis player with a racket and a ball. A writer with paper and pencil. A chef with a pan and spoon. To fulfill a task, a person often needs more than one thing.
Otherwise, they either won't be able to finish it completely, won't be able to take it to its full potential, or in some cases, won't be able to accomplish it at all.
When it comes to programming, beyond knowing the language itself (specifically Python here), there are so many tools you can use to supplement your learning and work more effectively.
Tools in Python mostly fall into three main categories: IDEs, packages, and other supporting tools.
Although you might spend the most time inside an Integrated Development Environment (IDE), organizing your projects more efficiently and knowing what you can connect to and build beyond it is one of the best ways to become a more adept and skilled programmer.
Below, we've compiled some of the best IDEs, packages, and additional tools you could use, according to the Python developer community.
The Best IDEs
Before we delve into all the options, here is a quick definition, or let's say the minimum requirements, for an IDE:
An IDE (Integrated Development Environment) is an all-in-one, go-to interface for software development. It comes equipped with all the tools needed to create a program, so a developer's first step usually begins with an IDE. IDEs usually contain:
- a text editor
- debugging capabilities
- version control mechanisms
- and multiple other external connections and supporting utilities, depending on the interface
Some of the most popular ones out there:
- PyCharm by JetBrains
PyCharm can handle everything: web development, data science computation, AI/ML algorithms, and more. Between its free and paid versions, PyCharm users have access to a seamless code editor, intelligent code inspectors and debuggers (it has one of the most advanced debuggers, drawing on static and runtime code analysis), test runners, database and scientific tools, web framework support and more.
Something extra special about PyCharm is its versatile plug-in library, which lets you expand its functionality even further.
Visual Studio Code is a product of Microsoft's tech ecosystem. Like other interfaces from its place of origin, one of its biggest plus points is its polished user experience. Its code editor is excellent, with special features such as semantic IntelliSense code completion and extensive code refactoring.
But one of the coolest things about Visual Studio Code is its Live Share feature, which lets you pair-program with someone else in a shared session.
PS - Visual Studio Code or VSCode is my favorite IDE :)
Spyder is a great beginner-level IDE, along with PyCharm. (I actually used both of them in my intro college classes.) What makes Spyder stand out is definitely its easy, clear user interface. Along with an editor and file directory like all other IDEs, Spyder provides a variable explorer and a multi-functional console (that outputs graphs and visualizations as well). Its IPython console makes it a great way to run data science and machine learning programs and conveniently view and navigate your results.
Jupyter Notebook should be a key addition to every data scientist's toolbox. It's a web application built on a server-client structure, which facilitates the use of other web tools and outputs. Notebook documents contain live code and can easily run equations, data cleaning, data visualization and transformation, statistical modeling and more. Notebooks can also be converted to other formats such as Markdown and HTML, allowing them to be shared and reproduced easily.
The Best Python Packages (organized by use)
Data Science (at least the early stages of it)
For the foundational levels of data science, there are two main steps: 1) cleaning and analyzing data and 2) visualizing data.
Step 1: Data Analysis
Pandas is one of the most popular and iconic Python packages. Straight from its website, pandas is a "fundamental high-level building block for doing practical, real world data analysis in Python." As a foundational step, it's great for analyzing and manipulating large data sets. Its signature DataFrame structure, time series objects, and group-by and other SQL-esque commands make it easy to learn, flexible, and extremely useful.
This is a Pandas DataFrame run in a Jupyter Notebook:
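If you'd like to try the DataFrame and group-by ideas yourself, here is a minimal sketch. The `sales` table and its column names are made up purely for illustration; swap in your own data.

```python
import pandas as pd

# Hypothetical sample data, invented for this example only.
sales = pd.DataFrame(
    {
        "region": ["East", "West", "East", "West", "East"],
        "units": [10, 4, 7, 12, 3],
        "price": [2.5, 3.0, 2.5, 3.0, 2.5],
    }
)

# Derive a new column from existing ones (vectorized, no explicit loop).
sales["revenue"] = sales["units"] * sales["price"]

# SQL-style "GROUP BY region" with two named aggregates.
summary = sales.groupby("region").agg(
    total_units=("units", "sum"),
    total_revenue=("revenue", "sum"),
)

print(summary)
```

In a Jupyter Notebook you could leave `summary` as the last line of a cell instead of calling `print`, and it would render as a formatted table.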
NumPy is powerful and fast, fast, fast. Primarily used for mathematical computation on arrays, its operations extend to trigonometric, statistical, and algebraic routines. With arrays, you can go from sorting and transposing all the way to sine and cosine functions. NumPy supports other data types too, but it is the place to go to realize the potential of array computing, which is a great vehicle for storing, manipulating, and analyzing data.
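Here's a small sketch of the array operations mentioned above (sorting, transposing, sine and cosine, and a simple statistic). The array contents are arbitrary example values.

```python
import numpy as np

# A small 2x3 array of arbitrary example values.
values = np.array([[3, 1, 2], [9, 7, 8]])

sorted_rows = np.sort(values, axis=1)  # sort each row independently
transposed = values.T                  # swap rows and columns: shape (3, 2)

# Trigonometric functions apply element-wise to the whole array.
angles = np.array([0.0, np.pi / 2, np.pi])
sines = np.sin(angles)
cosines = np.cos(angles)

# A statistical routine: the mean of each column.
mean_per_column = values.mean(axis=0)

print(sorted_rows)
print(transposed.shape)
print(mean_per_column)
```

Notice that nothing here needs a `for` loop: each call operates on the whole array at once, which is where much of NumPy's speed comes from.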
Step 2: Data Visualization
Matplotlib at its core is a two-dimensional plotting library that allows you to create most types of charts (line plots, histograms, bar graphs, error charts, etc.) with just a few lines of code. It's pretty extensible, as it can be embedded in graphical user interface toolkits and used in IPython shells and web application servers.
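To show what "just a few lines of code" can look like, here is a minimal sketch that draws a line plot and a bar chart side by side. The data and the output file name `example_plots.png` are placeholders chosen for this example.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt

# Arbitrary example data.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

# One figure with two side-by-side axes.
fig, (ax_line, ax_bar) = plt.subplots(1, 2, figsize=(8, 3))

ax_line.plot(x, y, marker="o")
ax_line.set_title("Line plot")

ax_bar.bar(x, y)
ax_bar.set_title("Bar chart")

fig.tight_layout()
fig.savefig("example_plots.png")  # placeholder file name
```

In a notebook or an interactive session you would typically drop the `matplotlib.use("Agg")` line and call `plt.show()` instead of saving to a file.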
Seaborn is basically a cooler and fancier version of Matplotlib. Built on top of Matplotlib, Seaborn comes with more high-level functionality for creating more informative and attractive statistical visualizations. It requires less code in most cases and has some stunning default themes. It also makes some of the more advanced dataviz requirements easier (like mapping a color to a variable or faceting).
A preview of some of the amazing graphs you can make with Seaborn:
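To try the color-mapping and faceting ideas yourself, here's a small sketch using `relplot`. The `study` table, its column names, and the output file name are invented for illustration, and `set_theme` assumes a reasonably recent Seaborn release (0.11 or later).

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import pandas as pd
import seaborn as sns

sns.set_theme()  # apply Seaborn's default theme

# Hypothetical data, made up for this example.
study = pd.DataFrame(
    {
        "hours": [1, 2, 3, 4, 1, 2, 3, 4],
        "score": [52, 60, 71, 80, 55, 58, 75, 83],
        "group": ["A", "A", "B", "B", "A", "A", "B", "B"],
        "term": ["fall", "fall", "fall", "fall",
                 "spring", "spring", "spring", "spring"],
    }
)

# hue= maps a color to a variable; col= creates one panel (facet) per value.
grid = sns.relplot(data=study, x="hours", y="score", hue="group", col="term")
grid.savefig("seaborn_example.png")  # placeholder file name
```

Doing the same thing in plain Matplotlib would mean creating the subplots, filtering the data per panel, and assigning colors by hand, which is the kind of work Seaborn takes off your plate.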
Plotly is a great exploratory dataviz package, capable of analyzing and displaying statistical, scientific, financial and geographic data. Plotly is similar to Matplotlib in that it offers a quick, straightforward syntax for creating visualizations. However, Plotly can make more elaborate figures and include elegant interactive elements. It works great in conjunction with the Pandas package, making it easier to adjust your data and zero in on specific elements in your visualization.
Scikit-Learn is probably where you want to start as a beginner in ML. It's a straightforward but comprehensive tool for data science that bundles many of the models you could run in ML. A brief overview of what you could do with Scikit-Learn: classification, logistic regression, clustering, Gaussian models, decision trees, and K-Nearest Neighbors.
Pro Tip: The Scikit-Learn website has extremely clear and detailed user guides on how to tackle any of these data science models. Scikit-Learn is also used by Spotify.
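To give a feel for the workflow, here is a minimal sketch of one of the models listed above, K-Nearest Neighbors, trained on the small iris dataset that ships with Scikit-Learn. The split size, `random_state`, and `n_neighbors=5` are arbitrary choices for the example, not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load the bundled iris dataset as feature matrix X and label vector y.
X, y = load_iris(return_X_y=True)

# Hold out 25% of the rows for testing; stratify keeps class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit a K-Nearest Neighbors classifier and score it on unseen data.
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)

print(f"Test accuracy: {accuracy:.2f}")
```

Most Scikit-Learn models follow this same `fit` / `predict` / `score` pattern, so swapping in a decision tree or logistic regression usually means changing only the import and the line that creates `model`.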
Developed by Google, TensorFlow has widespread recognition in the ML community. It's an end-to-end, all-inclusive ecosystem for machine learning applications. It was created to make it easy to deploy ML models and share your work with other users. Some of the things you can do with TensorFlow are running deep neural networks for handwriting classification, image recognition, natural language processing, and more. One of its greatest benefits is its high-level abstraction of the underlying computation, which provides a user-friendly interface for delving into ML.
PyTorch is an open source AI/ML library that specializes in neural-network-based deep learning models, for applications such as computer vision and for architectures such as long short-term memory (LSTM) networks. Although this jargon might seem a bit complicated, PyTorch has been designed to be very user-friendly. It uses basic, well-known Python concepts like classes, data structures, conditionals and loops, making it easy to learn and to build something ourselves. The learning curve for PyTorch is surprisingly short, allowing the package creators to boast of its rapid cycle from research to actual product deployment.
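To see how PyTorch leans on ordinary Python classes and loops, here is a minimal sketch that trains a tiny network to fit the line y = 2x + 1. The `TinyNet` class, the layer sizes, the learning rate, and the number of steps are all arbitrary choices made for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)  # make the random initial weights repeatable


class TinyNet(nn.Module):
    """A hypothetical two-layer network, defined as a plain Python class."""

    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(nn.Linear(1, 8), nn.ReLU(), nn.Linear(8, 1))

    def forward(self, x):
        return self.layers(x)


# Synthetic training data: 64 points on the line y = 2x + 1.
x = torch.linspace(-1, 1, 64).unsqueeze(1)
y = 2 * x + 1

model = TinyNet()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

initial_loss = loss_fn(model(x), y).item()

# The training loop is an ordinary Python for-loop.
for step in range(200):
    optimizer.zero_grad()        # clear gradients from the previous step
    loss = loss_fn(model(x), y)  # forward pass and loss
    loss.backward()              # backpropagate
    optimizer.step()             # update the weights

final_loss = loss_fn(model(x), y).item()
print(f"loss before: {initial_loss:.4f}, after: {final_loss:.4f}")
```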
For Building Websites and APIs
Flask is known as a "microframework" because it keeps its core small, which also makes it a gateway to building web applications for beginners. It includes all the essential components like templates, request/response objects, and URL routing. But if you want to take it to the next level, Flask allows for more complex functionality like database access or form generation and validation through extensions. Flask offers recommendations and suggestions as you develop your app but doesn't enforce any strict dependencies or layout. It is completely up to you to choose what and how to build.
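Here's a minimal sketch of the URL routing and request/response objects described above. The routes, the `greet` function, and the `shout` query parameter are invented for this example. It uses Flask's built-in test client so you can run it without starting a server.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


@app.route("/")
def index():
    # The simplest possible route: return plain text.
    return "Hello from Flask!"


@app.route("/greet/<name>")
def greet(name):
    # <name> is captured from the URL; ?shout=1 is read from the query string.
    shout = request.args.get("shout") == "1"
    message = f"Hello, {name}!"
    return jsonify(message=message.upper() if shout else message)


# Exercise the app in-process with the test client (no server needed).
with app.test_client() as client:
    response = client.get("/greet/Ada?shout=1")
    print(response.status_code, response.get_json())
```

To serve the app for real during development, you would typically save it to a file and start Flask's development server, for example with `flask run`.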
Django is a full-stack Python web framework that allows you to rapidly build an elegant, functional web application. Many web applications in Python are built with Django. It comes with a proven toolkit of modules, attributes, and methods, allowing you to build apps quickly and follow modern, practical coding practices while doing so. It's difficult to summarize everything Django makes possible, but we can definitely say that it's easy to use, fast, well-documented, secure and suited to a wide range of web projects.
Web Scraping Tools
Built to save programmers hours and days of work, BeautifulSoup is an efficiently designed Python library for extracting data out of HTML and XML files. It works with your go-to parser and gives you ways to navigate, search, and modify the parse tree. One particular plus point of BeautifulSoup is that it can automatically detect a document's encoding and convert it to Unicode, allowing for elegant handling of documents.
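As a quick illustration of extracting data and modifying the parse tree, here is a minimal sketch that parses a small HTML snippet. The snippet itself is made up, and the example uses Python's built-in `html.parser` so no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup

# A made-up HTML document standing in for a downloaded page.
html = """
<html><body>
  <h1>Reading list</h1>
  <ul>
    <li><a href="/books/1">First book</a></li>
    <li><a href="/books/2">Second book</a></li>
  </ul>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# Navigate the tree: grab the text of the first <h1>.
title = soup.h1.get_text(strip=True)

# Search the tree: collect (link text, href) for every <a> tag.
links = [(a.get_text(strip=True), a["href"]) for a in soup.find_all("a")]

# Modify the tree: replace the heading's text in place.
soup.h1.string = "Updated reading list"

print(title)
print(links)
print(soup.h1)
```

In a real scraper the `html` string would come from a downloaded page rather than being typed in by hand.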
Selenium is a tool meant for web-browser automation and testing. While it works with most major programming languages, Selenium's Python bindings offer a straightforward API for web scraping through the tool's WebDriver. Loading the web page in a real browser, it can detect and interact with HTML elements. But its capabilities extend to other things like handling dynamic names and switching browser windows.
I could say that Scrapy emulates the qualities of its sorta namesake, the word scrappy, because it's powerful, determined and resourceful. Scrapy is a complete web crawling and scraping framework, and it can really do it all, from testing to monitoring web pages to extracting data. Scrapy users can define Spiders (classes that scrape data) and crawl through a website (or group of websites). Scrapy is the vehicle for building more complex and sophisticated data pipelines.
This is how the process of using a Scrapy Web Crawler works:
While there is no requirement for you to start incorporating all the Python tools mentioned above, it's useful to be aware of the instruments out there so you can adjust and improve your programming practice accordingly. Often, these tools are used in conjunction rather than on their own. So know that it's alright to start small at first, because these tools will help you build the big things in the future.
If you're looking for a place to get exposure to some of these tools and environments, check out some of the free Python projects on TheCodex and take your pick from building with things like BeautifulSoup, Selenium, PyCharm, Matplotlib, Flask and more.
Hi– I'm Sreya! Thanks for reading! :)