Learning HTTP

author James Bunton <jamesbunton@delx.au>

Mon, 1 Jan 2024 07:52:04 +0000 (18:52 +1100)

committer James Bunton <jamesbunton@delx.au>

Mon, 1 Jan 2024 09:27:07 +0000 (20:27 +1100)
diff --git a/Makefile b/Makefile

new file mode 100644 (file)

index 0000000..29e0321
--- /dev/null
+++ b/Makefile
@@ -0,0 +1,11 @@
+MARKDOWN := python3 -mmarkdown -x markdown.extensions.fenced_code -x markdown.extensions.nl2br
+
+all: $(addsuffix .html, $(basename $(wildcard *.md)))
+
+%.html: %.md
+	$(MARKDOWN) < $< > $@
+
+clean:
+	rm -f *.html
+
+.PHONY: all clean
diff --git a/exercise2.py b/exercise2.py

new file mode 100644 (file)

index 0000000..f77f610
--- /dev/null
+++ b/exercise2.py
@@ -0,0 +1,40 @@
+#!/usr/bin/env python3
+
+import http.server
+
+COUNTER = 0
+
+def main():
+    listen_address = ('localhost', 8000)
+    request_handler = MyRequestHandler
+    server = http.server.HTTPServer(listen_address, request_handler)
+    server.serve_forever()
+
+class MyRequestHandler(http.server.BaseHTTPRequestHandler):
+    def do_GET(self):
+        if not self.path.endswith('.html'):
+            self.send_response(404)
+            self.end_headers()
+            self.write('File not found')
+            return
+
+        global COUNTER
+        self.send_response(200)
+        self.send_header('Content-type', 'text/html')
+        self.end_headers()
+        self.write('<html>')
+        self.write('<head><title>My web server!</title></head>')
+        self.write('<body>')
+        self.write('Hi there!<br>')
+        self.write('You requested: ' + self.path + '<br>')
+        self.write('You are using this client: ' + self.headers.get('user-agent', 'unknown') + '<br>')
+        COUNTER = COUNTER + 1
+        self.write(f'We have had <b>{COUNTER}</b> visitors today')
+        self.write('</body>')
+        self.write('</html>')
+
+    def write(self, text):
+        self.wfile.write(text.encode('utf-8'))
+
+if __name__ == '__main__':
+    main()
diff --git a/lesson1_HTTP.html b/lesson1_HTTP.html

new file mode 100644 (file)

index 0000000..5804e6d
--- /dev/null
+++ b/lesson1_HTTP.html
@@ -0,0 +1,47 @@
+<h1>Lesson 1 HTTP</h1>
+<h2>References:</h2>
+<ul>
+<li>https://developer.mozilla.org/en-US/docs/Web/HTTP/Overview</li>
+<li>https://www.rfc-editor.org/rfc/rfc2616</li>
+<li>https://curl.se/docs/manpage.html</li>
+</ul>
+<h2>Overview</h2>
+<p>HTTP was created in 1991. Web browsers used HTTP to fetch static HTML, text and images from web servers. These static files sat in a directory on the server, and the only way to change the website was to update those files.</p>
+<p>You could type a URL like <code>http://acme.com/products.html</code> into a graphical web browser, and the browser would then speak HTTP to the <code>acme.com</code> server to fetch the <code>/products.html</code> file. The user could then click a link such as <code>&lt;a href="http://contoso.net/about.html"&gt;</code>, causing the browser to visit a different site by speaking HTTP to the <code>contoso.net</code> server to fetch the <code>/about.html</code> file.</p>
+<h2>Learning objective</h2>
+<p>The goal is to learn about HTTP, specifically what it does and what it looks like.<br />
+- What is an HTTP method?<br />
+- What is an HTTP status code?<br />
+- Identify the components of a URL: the server and the path.<br />
+- How does an HTTP server use the server and path from the URL?</p>
+<h2>Exercises</h2>
+<h3>Run a simple web server using Python</h3>
+<ul>
+<li>Make a new directory containing at least two text or HTML files, e.g. <code>demo.html</code> and <code>hello.txt</code>.</li>
+<li>Open a terminal in that directory (right click, then open in terminal).</li>
+<li>Start the web server: <code>python3 -mhttp.server</code></li>
+<li>Keep this terminal window open for later.</li>
+</ul>
+<h3>Access the web server with a browser like Firefox</h3>
+<ul>
+<li>Open a web browser and visit: http://localhost:8000</li>
+<li>See what is happening in the web server terminal window.</li>
+<li>You can click on the files to view them.</li>
+<li>The browser is acting as an HTTP client to view the files being served by the Python HTTP server.</li>
+</ul>
+<h3>Access the web server with a command line client, curl</h3>
+<ul>
+<li>Open a new terminal window in addition to the Python web server one.</li>
+<li>Run <code>curl http://localhost:8000</code> to see the directory index generated by the web server.</li>
+<li>Run <code>curl http://localhost:8000/demo.html</code> to see a file you created in the web server directory.</li>
+<li>Try variations of this like: <code>/hello.txt</code>, or <code>/missing.txt</code></li>
+<li>You can also try this on other web servers, like <code>https://www.google.com</code></li>
+<li>Try in verbose mode to see the raw HTTP: <code>curl -v http://localhost:8000</code></li>
+</ul>
+<h3>Access the web server by typing raw HTTP</h3>
+<ul>
+<li>Run the command <code>nc localhost 8000</code></li>
+<li>Enter <code>GET / HTTP/1.0</code>, then press return twice</li>
+<li>Try with <code>GET /hello.txt</code> or other variations to see how the server responds.</li>
+</ul>
+<p>HTTP is just text! When you click a link, all that Firefox does is send this text to the server and display the result on the screen.</p>
+\ No newline at end of file
diff --git a/lesson1_HTTP.md b/lesson1_HTTP.md

new file mode 100644 (file)

index 0000000..9334e2d
--- /dev/null
+++ b/lesson1_HTTP.md
@@ -0,0 +1,49 @@
+# Lesson 1 HTTP
+
+## References:
+- https://developer.mozilla.org/en-US/docs/Web/HTTP/Overview
+- https://www.rfc-editor.org/rfc/rfc2616
+- https://curl.se/docs/manpage.html
+
+## Overview
+
+HTTP was created in 1991. Web browsers used HTTP to fetch static HTML, text and images from web servers. These static files sat in a directory on the server, and the only way to change the website was to update those files.
+
+You could type a URL like `http://acme.com/products.html` into a graphical web browser, and the browser would then speak HTTP to the `acme.com` server to fetch the `/products.html` file. The user could then click a link such as `<a href="http://contoso.net/about.html">`, causing the browser to visit a different site by speaking HTTP to the `contoso.net` server to fetch the `/about.html` file.
+
+## Learning objective
+
+The goal is to learn about HTTP, specifically what it does and what it looks like.
+- What is an HTTP method?
+- What is an HTTP status code?
+- Identify the components of a URL: the server and the path.
+- How does an HTTP server use the server and path from the URL?
+
+## Exercises
+
+### Run a simple web server using Python
+- Make a new directory containing at least two text or HTML files, e.g. `demo.html` and `hello.txt`.
+- Open a terminal in that directory (right click, then open in terminal).
+- Start the web server: `python3 -mhttp.server`
+- Keep this terminal window open for later.
+
+### Access the web server with a browser like Firefox
+- Open a web browser and visit: http://localhost:8000
+- See what is happening in the web server terminal window.
+- You can click on the files to view them.
+- The browser is acting as an HTTP client to view the files being served by the Python HTTP server.
+
+### Access the web server with a command line client, curl
+- Open a new terminal window in addition to the Python web server one.
+- Run `curl http://localhost:8000` to see the directory index generated by the web server.
+- Run `curl http://localhost:8000/demo.html` to see a file you created in the web server directory.
+- Try variations of this like: `/hello.txt`, or `/missing.txt`
+- You can also try this on other web servers, like `https://www.google.com`
+- Try in verbose mode to see the raw HTTP: `curl -v http://localhost:8000`
+
+### Access the web server by typing raw HTTP
+- Run the command `nc localhost 8000`
+- Enter `GET / HTTP/1.0`, then press return twice
+- Try with `GET /hello.txt` or other variations to see how the server responds.
+
+HTTP is just text! When you click a link, all that Firefox does is send this text to the server and display the result on the screen.
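
The raw-HTTP exercise can also be reproduced from Python with a plain TCP socket, which makes the "HTTP is just text" point concrete. This is a minimal sketch rather than part of the lesson: `build_request`, `parse_status_line` and `raw_get` are hypothetical helper names, and it assumes the server from the first exercise is still listening on `localhost:8000`.

```python
import socket


def build_request(path, version='HTTP/1.0'):
    """Return the bytes a client sends for a bare GET request."""
    # The empty line (the second CRLF) marks the end of the headers,
    # which is why return is pressed twice when typing into nc.
    return f'GET {path} {version}\r\n\r\n'.encode('ascii')


def parse_status_line(response_text):
    """Split a line like 'HTTP/1.0 200 OK' into (version, code, reason)."""
    status_line = response_text.split('\r\n', 1)[0]
    version, code, reason = status_line.split(' ', 2)
    return version, int(code), reason


def raw_get(host, port, path):
    """Send a raw GET request and return everything the server sends back."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(build_request(path))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # an HTTP/1.0 server closes the connection when done
                break
            chunks.append(chunk)
    return b''.join(chunks).decode('utf-8', errors='replace')
```

With the exercise server running, `print(raw_get('localhost', 8000, '/hello.txt'))` should show the same status line, headers and body that `nc` displays, and `parse_status_line` pulls the status code out of that text.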
diff --git a/lesson2_dynamic-web.html b/lesson2_dynamic-web.html

new file mode 100644 (file)

index 0000000..6e0735a
--- /dev/null
+++ b/lesson2_dynamic-web.html
@@ -0,0 +1,82 @@
+<h1>Lesson 2 the dynamic web</h1>
+<h2>References:</h2>
+<ul>
+<li>https://docs.python.org/3</li>
+<li>https://docs.python.org/3/library/http.html</li>
+<li>https://github.com/python/cpython/blob/3.12/Lib/http/server.py</li>
+<li>https://docs.python.org/3/tutorial</li>
+<li>https://docs.python.org/3/tutorial/inputoutput.html#formatted-string-literals</li>
+<li>https://developer.mozilla.org/en-US/docs/Glossary/Favicon</li>
+</ul>
+<h2>Overview</h2>
+<p>In the mid to late 1990s the web started becoming dynamic. Web servers might return different content to different users who requested the same URL. A common example of this was a hit counter which would increment each time any user visited the website, or a guest book to allow users to leave messages.</p>
+<h2>Learning objective</h2>
+<p>The goal is to build a simple dynamic web server.<br />
+- Understand how the HTTP method and path map to a Python function call.<br />
+- Build a web server that returns dynamically generated content.</p>
+<h2>Exercises</h2>
+<h3>Build a simple HTTP server in Python</h3>
+<p>Create a file, <code>ex2.py</code>, and paste the following into it:</p>
+<pre><code>import http.server
+
+def main():
+    listen_address = ('localhost', 8000)
+    request_handler = http.server.SimpleHTTPRequestHandler
+    server = http.server.HTTPServer(listen_address, request_handler)
+    server.serve_forever()
+
+if __name__ == '__main__':
+    main()
+</code></pre>
+<p>Run it with <code>python3 ex2.py</code>. It should work exactly the same as <code>python3 -mhttp.server</code> from the previous exercise.</p>
+<h3>Return HTML from a function instead of a file</h3>
+<pre><code>import http.server
+
+def main():
+    listen_address = ('localhost', 8000)
+    request_handler = MyRequestHandler
+    server = http.server.HTTPServer(listen_address, request_handler)
+    server.serve_forever()
+
+class MyRequestHandler(http.server.BaseHTTPRequestHandler):
+    def write(self, text):
+        self.wfile.write(text.encode('utf-8'))
+
+    def do_GET(self):
+        self.send_response(200)
+        self.send_header('Content-type', 'text/html')
+        self.end_headers()
+        self.write('&lt;html&gt;')
+        self.write('&lt;head&gt;&lt;title&gt;My web server!&lt;/title&gt;&lt;/head&gt;')
+        self.write('&lt;body&gt;Hi there!&lt;/body&gt;')
+        self.write('&lt;/html&gt;')
+
+if __name__ == '__main__':
+    main()
+</code></pre>
+<h3>Make the HTML dynamic!</h3>
+<p>Add <code>COUNTER = 42</code> to the top of the file.</p>
+<p>Then modify your <code>GET</code> handler to return some dynamic HTML! Something like this...</p>
+<pre><code>global COUNTER
+COUNTER = COUNTER + 1
+self.write(f'We have had &lt;b&gt;{COUNTER}&lt;/b&gt; visitors today')
+</code></pre>
+<p>You can try adding other information to the response too:</p>
+<pre><code>self.write('You requested: ' + self.path + '&lt;br&gt;')
+self.write('You are using this client: ' + self.headers.get('user-agent', 'unknown') + '&lt;br&gt;')
+</code></pre>
+<h3>A few things to note</h3>
+<p>Remember doing raw HTTP requests in the previous lesson? With the code above, if a client sends <code>GET /file.txt</code> then Python's <code>http.server</code> library parses the HTTP request and does something like the following:<br />
+- Creates a new instance of the <code>MyRequestHandler</code> class.<br />
+- Sets the HTTP path as: <code>self.path = '/file.txt'</code> for this new instance.<br />
+- Calls the <code>do_GET()</code> method on this new instance, because the HTTP method was <code>GET</code>.</p>
+<h3>Fix the double-counting bug</h3>
+<p>Notice that when you press <code>ctrl-shift-R</code> to reload, your counter goes up by two at a time? If you look at the log you can see this is because Firefox is also requesting <code>/favicon.ico</code>, the little icon shown next to the URL in the address bar. Our site isn't fancy enough for this, so we should modify <code>do_GET()</code> to return a <code>404 not found</code> for these requests.</p>
+<p>Something like this:</p>
+<pre><code>if not self.path.endswith('.html'):
+    self.send_response(404)
+    self.end_headers()
+    self.write('File not found')
+    return
+</code></pre>
+<p>Now try visiting some URL that doesn't end with <code>.html</code> and you'll see your 'not found' message.</p>
+\ No newline at end of file
diff --git a/lesson2_dynamic-web.md b/lesson2_dynamic-web.md

new file mode 100644 (file)

index 0000000..cabb972
--- /dev/null
+++ b/lesson2_dynamic-web.md
@@ -0,0 +1,110 @@
+# Lesson 2 the dynamic web
+
+## References:
+- https://docs.python.org/3
+- https://docs.python.org/3/library/http.html
+- https://github.com/python/cpython/blob/3.12/Lib/http/server.py
+- https://docs.python.org/3/tutorial
+- https://docs.python.org/3/tutorial/inputoutput.html#formatted-string-literals
+- https://developer.mozilla.org/en-US/docs/Glossary/Favicon
+
+## Overview
+
+In the mid to late 1990s the web started becoming dynamic. Web servers might return different content to different users who requested the same URL. A common example of this was a hit counter which would increment each time any user visited the website, or a guest book to allow users to leave messages.
+
+## Learning objective
+
+The goal is to build a simple dynamic web server.
+- Understand how the HTTP method and path map to a Python function call.
+- Build a web server that returns dynamically generated content.
+
+## Exercises
+
+### Build a simple HTTP server in Python
+
+Create a file, `ex2.py`, and paste the following into it:
+```
+import http.server
+
+def main():
+    listen_address = ('localhost', 8000)
+    request_handler = http.server.SimpleHTTPRequestHandler
+    server = http.server.HTTPServer(listen_address, request_handler)
+    server.serve_forever()
+
+if __name__ == '__main__':
+    main()
+```
+
+Run it with `python3 ex2.py`. It should work exactly the same as `python3 -mhttp.server` from the previous exercise.
+
+
+### Return HTML from a function instead of a file
+
+```
+import http.server
+
+def main():
+    listen_address = ('localhost', 8000)
+    request_handler = MyRequestHandler
+    server = http.server.HTTPServer(listen_address, request_handler)
+    server.serve_forever()
+
+class MyRequestHandler(http.server.BaseHTTPRequestHandler):
+    def write(self, text):
+        self.wfile.write(text.encode('utf-8'))
+
+    def do_GET(self):
+        self.send_response(200)
+        self.send_header('Content-type', 'text/html')
+        self.end_headers()
+        self.write('<html>')
+        self.write('<head><title>My web server!</title></head>')
+        self.write('<body>Hi there!</body>')
+        self.write('</html>')
+
+if __name__ == '__main__':
+    main()
+```
+
+
+### Make the HTML dynamic!
+
+Add `COUNTER = 42` to the top of the file.
+
+Then modify your `GET` handler to return some dynamic HTML! Something like this...
+
+```
+global COUNTER
+COUNTER = COUNTER + 1
+self.write(f'We have had <b>{COUNTER}</b> visitors today')
+```
+
+You can try adding other information to the response too:
+```
+self.write('You requested: ' + self.path + '<br>')
+self.write('You are using this client: ' + self.headers.get('user-agent', 'unknown') + '<br>')
+```
+
+
+### A few things to note
+
+Remember doing raw HTTP requests in the previous lesson? With the code above, if a client sends `GET /file.txt` then Python's `http.server` library parses the HTTP request and does something like the following:
+- Creates a new instance of the `MyRequestHandler` class.
+- Sets the HTTP path as: `self.path = '/file.txt'` for this new instance.
+- Calls the `do_GET()` method on this new instance, because the HTTP method was `GET`.
+
+### Fix the double-counting bug
+
+Notice that when you press `ctrl-shift-R` to reload, your counter goes up by two at a time? If you look at the log you can see this is because Firefox is also requesting `/favicon.ico`, the little icon shown next to the URL in the address bar. Our site isn't fancy enough for this, so we should modify `do_GET()` to return a `404 not found` for these requests.
+
+Something like this:
+```
+if not self.path.endswith('.html'):
+    self.send_response(404)
+    self.end_headers()
+    self.write('File not found')
+    return
+```
+
+Now try visiting some URL that doesn't end with `.html` and you'll see your 'not found' message.
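
The way the HTTP method and path end up as a Python method call can be sketched without any networking. The toy class below is not the real `http.server` code; `ToyHandler` and `serve` are hypothetical names, and the sketch only imitates the idea of looking up a method called `do_` plus the HTTP method on a fresh handler instance for every request.

```python
class ToyHandler:
    """Hypothetical stand-in for a request handler, for illustration only."""

    def __init__(self, request_line):
        # 'GET /file.txt HTTP/1.0' -> method, path, version
        self.command, self.path, self.version = request_line.split()

    def handle(self):
        # Look up a method named after the HTTP method, e.g. do_GET for GET.
        method = getattr(self, 'do_' + self.command, None)
        if method is None:
            return 501, f'Unsupported method: {self.command}'
        return method()

    def do_GET(self):
        # Same rule as the favicon fix: only .html paths are served.
        if not self.path.endswith('.html'):
            return 404, 'File not found'
        return 200, f'You requested: {self.path}'


def serve(request_line):
    # A new handler instance is created for every request.
    return ToyHandler(request_line).handle()
```

For example, `serve('GET /favicon.ico HTTP/1.0')` takes the 404 branch, while a `POST` request line finds no `do_POST` method and falls through to the 501 case.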