Announcing Learn Python the Hard Way's Next Edition
Did you know when you sign a contract with a publisher you have to update your books? Neither did I! I'm mostly joking. I've had enough demands and complaints from readers of Learn Python the Hard Way that it was time for an update, but I was too deep in JavaScript land to have the bandwidth for it. Then last month my publisher started bothering me for updates as well, so now I'm on the hook for a new edition.
I was reluctant to work on anything new related to Python due to its stagnation in the web development space, but a few recent events have changed my mind: Codon and the popularity of Data Science.
Codon
I'm really excited about Codon and I'll be playing with it in the near future. I have a couple of fun projects in mind that specifically leverage Codon's abilities, and I'll hopefully have a few articles about Codon in practice. Mostly I'm interested in how Codon compiles Python, and its ability to interface with C fairly easily. It also seems to be really well designed, and apparently it can embed the CPython interpreter for those cases where you absolutely have to run Python.
Here's their example showing the `@python` decorator embedding the Python interpreter when you need it:
    @python
    def scipy_eigenvalues(i: List[List[float]]) -> List[float]:
        # Code within this block is executed by the Python interpreter,
        # so it must be valid Python code.
        import scipy.linalg
        import numpy as np
        data = np.array(i)
        eigenvalues, _ = scipy.linalg.eig(data)
        return list(eigenvalues)
    print(scipy_eigenvalues([[1.0, 2.0], [3.0, 4.0]]))  # [-0.372281, 5.37228]
What's amazing about this design is that it's combined with a very easy C FFI, thanks to Codon's use of LLVM as the backend:
    from C import pow(float, float) -> float
    pow(2.0, 2.0)  # 4.0
    # Import and rename function
    # cobj is a C pointer (void*, char*, etc.)
    # None can be used to represent C's void
    from C import puts(cobj) -> None as print_line
    print_line("hello".c_str())  # prints "hello"; c_str() converts Codon str to C string
You can even inline the LLVM IR directly in your code for the rare cases when the compiler needs a little help:
    @llvm
    def popcnt(n: int) -> int:
        declare i64 @llvm.ctpop.i64(i64)
        %0 = call i64 @llvm.ctpop.i64(i64 %n)
        ret i64 %0
    print(popcnt(42))  # 3
I have a few projects in mind that could use this in the future, but I will need to fully review it and I do have some reservations about its license. More on that later.
Python is Data Science Now
Codon is awesome, and it's definitely getting me interested in Python again, but the real winner in the Python world is Data Science. Right now AI, Data Science, and Machine Learning are hot, and they're the primary thing Python is being used for. I think most of the students who contact me wanting to learn Python are interested in the world of Data Science and not web development or "backend" programming. I think languages like Go, Rust, and JavaScript have largely supplanted Python for general systems programming, and there's some evidence from GitHub that shows this trend.
Here's a list of the top 30 Python projects on GitHub by stars. Do you notice something?
Name | Stars | Category |
---|---|---|
public-apis | 241137 | Scraping |
system-design-primer | 220942 | Systems |
awesome-python | 169124 | Education |
TheAlgorithms/Python | 159025 | Education |
Python-100-Days | 136201 | Education |
Auto-GPT | 135364 | ML/DS |
youtube-dl | 120574 | Scraping |
transformers | 101767 | ML/DS |
stable-diffusion-webui | 78293 | ML/DS |
thefuck | 77571 | Systems |
django | 71010 | Web |
HelloGitHub | 69243 | Education |
pytorch | 67235 | ML/DS |
flask | 63061 | Web |
home-assistant/core | 60641 | Systems |
awesome-machine-learning | 58946 | ML/DS |
keras | 58428 | ML/DS |
fastapi | 58363 | Web |
ansible | 57471 | Systems |
scikit-learn | 54360 | ML/DS |
cpython | 53404 | Python |
manim | 51485 | Graphing |
funNLP | 50741 | ML/DS |
requests | 49698 | Scraping |
face_recognition | 48357 | ML/DS |
yt-dlp | 47975 | Scraping |
PayloadsAllTheThings | 47941 | Security |
you-get | 47412 | Scraping |
scrapy | 47303 | Scraping |
localstack | 47235 | Systems |
If we count the projects by their categories we have the following breakdown:
Project Type | Count |
---|---|
ML/DS | 9 |
Scraping | 6 |
Systems | 5 |
Education | 4 |
Web | 3 |
Security | 1 |
Graphing | 1 |
Python | 1 |
Data science is the biggest category, especially if you consider things like Graphing and Scraping as being primarily used in Data Science. If you do that then 16 of these 30 projects, a bit over half, are related to Data Science. This fits with the wild success of Data Science, AI, and Machine Learning in the last five years, and the relative lack of innovation in Python's other use cases such as web development and systems management.
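The tally is easy to reproduce. Here's a minimal sketch using only the standard library. The `PROJECTS` list is copied by hand from the star table above, and the helper names `tally` and `share` are made up for illustration:

```python
from collections import Counter

# (project, category) pairs copied by hand from the star table above.
PROJECTS = [
    ("public-apis", "Scraping"), ("system-design-primer", "Systems"),
    ("awesome-python", "Education"), ("TheAlgorithms/Python", "Education"),
    ("Python-100-Days", "Education"), ("Auto-GPT", "ML/DS"),
    ("youtube-dl", "Scraping"), ("transformers", "ML/DS"),
    ("stable-diffusion-webui", "ML/DS"), ("thefuck", "Systems"),
    ("django", "Web"), ("HelloGitHub", "Education"),
    ("pytorch", "ML/DS"), ("flask", "Web"),
    ("home-assistant/core", "Systems"), ("awesome-machine-learning", "ML/DS"),
    ("keras", "ML/DS"), ("fastapi", "Web"),
    ("ansible", "Systems"), ("scikit-learn", "ML/DS"),
    ("cpython", "Python"), ("manim", "Graphing"),
    ("funNLP", "ML/DS"), ("requests", "Scraping"),
    ("face_recognition", "ML/DS"), ("yt-dlp", "Scraping"),
    ("PayloadsAllTheThings", "Security"), ("you-get", "Scraping"),
    ("scrapy", "Scraping"), ("localstack", "Systems"),
]


def tally(projects):
    """Count how many projects fall into each category."""
    return Counter(category for _, category in projects)


def share(projects, categories):
    """Return the fraction of projects whose category is in `categories`."""
    hits = sum(1 for _, category in projects if category in categories)
    return hits / len(projects)


if __name__ == "__main__":
    for category, count in tally(PROJECTS).most_common():
        print(f"{category:10} {count}")
    # Treat Scraping and Graphing as Data Science work, as the text suggests.
    ds_like = {"ML/DS", "Scraping", "Graphing"}
    print(f"Data Science related: {share(PROJECTS, ds_like):.0%}")
```

Change the set passed to `share` if you want to test a different grouping, for example leaving Scraping out.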
Now, if you think this isn't a fair analysis of popularity I want to stress that everyone is also quoting this as a measure of Python's general popularity. You aren't allowed to rave about Python climbing to the top of the GitHub stars chart and then balk at the suggestion that, actually, it's Data Science that's popular. Either stars are meaningless and Python's not popular, or stars are important and Python Data Science is popular.
The Master Plan
Learn Python the Hard Way has always been focused on Pre-Beginners in that it assumes nothing and aims at building the knowledge someone needs to eventually learn the topic. My approach is not to teach someone to be a master of the subject, but to teach them all the things other writers assume "beginners" already know. If you've ever read a book that starts with `print("Hello World")` then jumps to "a monad is just a monoid in the category of endofunctors" then my book teaches you what that author assumes you know.
Focusing on Data Science in my style means that I won't teach you the entire world of Data Science, since that's already covered by many more qualified people than me. My goal in the new Learn Python the Hard Way is to teach you everything about Python programming that those courses assume you already know. When you're done with my book you'll have the skills you need to then understand other books.
A secondary goal in the new book is to get you familiar with the basic tools used in Data Science, like Jupyter, Pandas, Anaconda, and low level topics like data munging, testing, and graphing. I won't go extremely deep into these topics, but having a familiarity with them will make other books easier to understand.
Finally, I'm going to target the new book at a secondary audience of people who are knowledgeable about Data Science but feel their Python skills are lacking. This would be anyone who has impostor syndrome when they write Python code and who wants to feel more confident in their basic Python knowledge. I want to "upgrade" people from strictly using Jupyter to creating full Python projects with automated testing for repeatable results, in addition to giving them detailed explanations of basic Python topics.
The Outline Thus Far
I've submitted the following outline to my publisher, but I'll be changing this as I work through the exercises using Jupyter. Remember that the goal of this course is not to craft a grand master of Python Data Science, but to teach a Pre-Beginner the basics of Python most other books assume you have.
First I start off with the usual first set of exercises to get people into controlling a computer with language, but I'll be using Anaconda and Jupyter exclusively to get people started.
- Exercise 0: Gearing Up
- Exercise 1: A Good First Program
- Exercise 2: Comments and Pound Characters
- Exercise 3: Numbers and Math
- Exercise 4: Variables and Names
- Exercise 5: More Variables and Printing
- Exercise 6: Strings and Text
- Exercise 7: Combining Strings
- Exercise 8: Formatting Strings Manually
- Exercise 9: Multi-line Strings
- Exercise 10: Escape Codes in Strings
Then I move on to simple I/O but focused on how to use Jupyter to create the files and open them. It's at this point that I'll start "weaning" people off Jupyter and start making little scripts using a simple external text editor. This will help when they want to move their work into an external project to share, or start adding more traditional Python resources such as automated testing, deployment, and package sharing.
- Exercise 11: Asking People Questions
- Exercise 12: An Easier Way to Prompt
- Exercise 13: Parameters, Unpacking, Variables
- Exercise 14: Prompting and Passing
- Exercise 15: Reading Files
- Exercise 16: Reading and Writing Files
- Exercise 17: More Files
It's at this point I can start introducing simple functional programming and data structures. There are some people who hang out on Stack Overflow yelling at beginners, and they think you should start with OOP right away, but there's a significant problem with this belief:
You can construct all of Object Oriented Programming from just functions and dicts. You can't construct functions and dicts from objects and classes without first explaining functions and dicts.
With that in mind I'll teach functions and functional programming first so that later I can show them how to build their own Object Oriented System from first principles.
- Exercise 18: Names, Variables, Code, Functions
- Exercise 19: The Concept of Jumps
- Exercise 20: Functions and Variables
- Exercise 21: Functions and Files
- Exercise 22: Functions Can Return Something
With functions covered I can then get deeper into strings and the basics of simple data types:
- Exercise 23: Strings, Bytes, and Character Encodings
- Exercise 24: Introductory Lists
- Exercise 25: Introductory Dictionaries
- Exercise 26: Lists and Dictionaries
After learning an introductory level of these basic data structures, and the previous information on jumps and functions, it's time to get into boolean logic, loops, and `if-statements`. Once again, if you know about jumps, and you know about boolean tests, then you can understand `if-statements`. If you understand jumps and `if-statements` then you can figure out basic looping. After that it's a process of combining data structures with more advanced loops like `for-loops`:
- Exercise 27: Memorizing Logic
- Exercise 28: Boolean Practice
- Exercise 29: What If
- Exercise 30: Else and If
- Exercise 31: Making Decisions
- Exercise 32: Loops and Lists
- Exercise 33: While Loops
- Exercise 34: Accessing Elements of Lists
- Exercise 35: Branches and Functions
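To make the "jumps plus boolean tests gives you loops" idea concrete, here's a rough sketch, not taken from the book. Both functions are hypothetical names invented for this example. The second one spells out the test and the jumps that the `for-loop` in the first one hides:

```python
def total_with_for(numbers):
    """Add up a list with a for-loop."""
    total = 0
    for n in numbers:
        total += n
    return total


def total_with_while(numbers):
    """Same sum using only a boolean test, an if-statement, and jumps."""
    total = 0
    i = 0
    while True:                # each pass ends with a jump back to here
        if i >= len(numbers):  # the boolean test decides whether to continue
            break              # jump out of the loop
        total += numbers[i]
        i += 1
    return total


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    print(total_with_for(data), total_with_while(data))
```

Once the explicit version makes sense, the `for-loop` is just a shorter way to write the same jumps.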
It's at this point that I've taught the fundamental parts of how programming works, so everything after this is either practicing those concepts or adding on concepts that use those fundamentals.
- Exercise 36: Designing and Debugging
- Exercise 37: Symbol Review
- Exercise 38: Doing Things to Lists
- Exercise 39: Doing Things to Dictionaries
Object Oriented Programming is an example of something that's far easier to teach once someone knows about `dict` and functions, so we get into this here. In the past I tried to "sneak" in an understanding of OOP with a weird method, but my JavaScript course has taught me that it's easier to teach people how to build their own basic OOP system with `dict` and closures, then show how that "maps" to the built-in OOP of the language:
- Exercise 40: From Dictionaries to Objects
- Exercise 41: Basic Object Oriented Programming
- Exercise 42: Inheritance and Advanced OOP
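Here's a minimal sketch of what "OOP from `dict` and closures" could look like. This is an illustration under my own assumptions and not the exercise from the book: `make_person` and `Person` are hypothetical names, and the class below shows how the homemade version might map onto Python's built-in OOP.

```python
def make_person(name, age):
    """Hypothetical constructor: the 'object' is just a dict."""
    self = {"name": name, "age": age}

    def greet():
        # The closure remembers `self`, so no explicit argument is needed.
        return f"Hi, I'm {self['name']} and I'm {self['age']}."

    def birthday():
        self["age"] += 1
        return self["age"]

    # "Methods" are ordinary functions stored under dict keys.
    self["greet"] = greet
    self["birthday"] = birthday
    return self


class Person:
    """The same idea written with Python's built-in classes."""

    def __init__(self, name, age):
        self.name = name
        self.age = age

    def greet(self):
        return f"Hi, I'm {self.name} and I'm {self.age}."

    def birthday(self):
        self.age += 1
        return self.age


if __name__ == "__main__":
    homemade = make_person("Frank", 34)
    builtin = Person("Frank", 34)
    homemade["birthday"]()
    builtin.birthday()
    print(homemade["greet"]())
    print(builtin.greet())
```

The two versions behave the same way. The difference is that `homemade["greet"]()` shows the dict lookup and function call that `builtin.greet()` does for you.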
Once they reach this point they're probably ready to move off Jupyter and learn how to create a regular Python project with automated testing. This will cover more traditional developer tools, and I might throw in an exercise that has a CLI crash course right here rather than as an appendix.
- Exercise 43: Graduating from Jupyter
- Exercise 44: Setting up Developer Tools
- Exercise 45: Managing Packages in Anaconda
- Exercise 46: A Project Skeleton
- Exercise 47: Automated Testing
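As a rough idea of what automated testing means here, the sketch below pairs a small function with tests that check it. The outline doesn't say which test framework the book will use, so the standard library's `unittest` is an assumption on my part, and `clean_price` is a hypothetical function invented for the example:

```python
import unittest


def clean_price(text):
    """Hypothetical function under test: turn '$1,200.50' into 1200.5."""
    return float(text.replace("$", "").replace(",", "").strip())


class CleanPriceTest(unittest.TestCase):
    def test_plain_number(self):
        self.assertEqual(clean_price("42"), 42.0)

    def test_symbols_and_commas(self):
        self.assertEqual(clean_price(" $1,200.50 "), 1200.5)

    def test_garbage_raises(self):
        # float() raises ValueError when the text is not a number.
        with self.assertRaises(ValueError):
            clean_price("free")
```

If you saved this as a file named something like `test_prices.py`, running `python -m unittest` in the same directory should discover and run the three tests. The point is that the checks are repeatable, which is hard to get from re-running notebook cells by hand.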
Finally, this is a book about getting someone ready to study other Data Science books, so I'll spend the final exercises lightly touching on various data science topics, things like Data Munging, DataFrames, graphing, and simple analysis. I might add in a bit of SQL but I'm not sure if I could cover enough SQL in a few exercises to be useful.
- Exercise 48: What is Data Munging?
- Exercise 49: Scraping Data from the Web
- Exercise 50: Getting Data from APIs
- Exercise 51: Munging Data Manually
- Exercise 52: Munging Data with Pandas
- Exercise 53: Pandas Dataframes in Depth
- Exercise 54: Data Munging Project 1: TBD
- Exercise 55: Graphing Data with Matplotlib
- Exercise 56: Basic Statistics with NumPy
- Exercise 57: Statistics with SciPy
- Exercise 58: Data Analysis Project 1: TBD
- Exercise 59: Data Analysis Project 2: TBD
- Exercise 61: Final Words
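For a sense of what "munging data manually" could involve before reaching for Pandas, here's a small sketch using only the standard library. The data in `RAW` is entirely made up, and `munge` is a hypothetical helper written for this example, not content from the book:

```python
import csv
import io
import statistics

# Made-up messy data standing in for a file you scraped or downloaded.
RAW = """city,temp_c
Lisbon, 21.5
Oslo,n/a
Cairo,35
,12
Lima, 19.0
"""


def munge(raw_text):
    """Parse CSV text, dropping rows with no city or an unparseable number."""
    rows = []
    for record in csv.DictReader(io.StringIO(raw_text)):
        city = (record["city"] or "").strip()
        try:
            temp = float(record["temp_c"] or "")
        except ValueError:
            continue  # skip values like "n/a" or an empty cell
        if city:
            rows.append((city, temp))
    return rows


if __name__ == "__main__":
    cleaned = munge(RAW)
    for city, temp in cleaned:
        print(f"{city:8} {temp:5.1f}")
    print("mean:", round(statistics.mean(t for _, t in cleaned), 2))
```

Doing this by hand once makes it clearer what a library is doing for you when it drops missing values and converts column types.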
That's the plan so far. If you have feedback on this list of topics based on what you do as a Data Scientist then feel free to contact me @lzsthw on Twitter. My only warning is, if you're looking to get me to teach people that one thing you found annoying at your last job or to turn them into Python true believers, then don't bother. I don't indoctrinate people. I create independent learners who question what they learn and form their own opinions.
Price Increase and Upgrades
Inflation is kicking everyone's grapes and I'm no different, so the price on the finished course will be $59 going forward. However, I will offer an upgrade price for the difference if you already bought my previous version, and I'll give a free upgrade to anyone who buys (or has bought) the current version of Learn Python the Hard Way after April 2023.
This means if you bought it on or before April 30th, 2023 you can pay $20 for an upgrade. If you bought it on or after May 1, 2023 you'll get a free upgrade. You'll also get early access to the content as I work on it and access to my Discord for help and feedback just like with Learn JavaScript the Hard Way.