This lesson is in the early stages of development (Alpha version)

Introduction to the Web and Online APIs: Glossary

Key Points

HTTP
  • A protocol is a standard for communicating data across a network. A port is a number that identifies which program should process a network connection.

  • HTTP is the protocol originally designed for requesting and receiving Web pages, but it is now also used as the basis for a variety of APIs. HTTPS is the encrypted version of HTTP.

  • Every page on the World Wide Web is identified by a URL, or Uniform Resource Locator.

  • A request is how you tell a server what you want to see. A response will either give you what you asked for, or tell you why the server can’t do that. Both requests and responses have a header, and optionally a body.

  • We can make requests and receive responses, as well as see their headers, using curl.
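
As a rough illustration of how a URL ties together the protocol, server, port, and page described above, the sketch below splits a URL into its parts with Python's standard library. The URL itself is made up for the example, and the `effective_port` helper is a hypothetical name introduced here; it falls back to the well-known default ports (80 for HTTP, 443 for HTTPS) when the URL does not name one.

```python
from urllib.parse import urlsplit

# Hypothetical URL, used only for illustration.
url = "https://www.example.com:8443/docs/index.html?lang=en"

parts = urlsplit(url)
print(parts.scheme)    # the protocol: https
print(parts.hostname)  # the server: www.example.com
print(parts.port)      # the port: 8443
print(parts.path)      # the page on that server: /docs/index.html
print(parts.query)     # extra parameters: lang=en

# Well-known default ports, used when the URL does not give one.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url):
    """Return the port a client would connect to for this URL (illustrative helper)."""
    parts = urlsplit(url)
    return parts.port or DEFAULT_PORTS.get(parts.scheme)


print(effective_port("http://www.example.com/"))  # 80
```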

What do APIs look like?
  • Interact with web APIs by sending requests to an endpoint representing a function of interest. Parameters can be encoded into the request URL, or attached to the request body in a format such as JSON.

  • Responses are typically plain text or JSON, but could be anything.

  • Most APIs require some form of authentication. This can be by username and password, or via a token.

  • Which choices a given API makes for each of these will be described in the API’s documentation.
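
The points above can be sketched with the standard library alone. The example below builds, but does not send, two requests: one with parameters encoded into the URL, and one with parameters attached as a JSON body. The base URL, the `/search` and `/notes` endpoints, the token value, and the `Bearer` header scheme are all assumptions made for illustration; a real API's documentation would specify its own.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://api.example.com"  # hypothetical API
TOKEN = "my-secret-token"             # placeholder token, not a real credential

# Option 1: parameters encoded into the request URL as a query string.
query = urlencode({"q": "penguins", "limit": 5})
search_request = Request(
    f"{BASE_URL}/search?{query}",
    headers={"Authorization": f"Bearer {TOKEN}"},  # assumed token scheme
)

# Option 2: parameters attached to the request body as JSON.
payload = json.dumps({"title": "Field notes", "tags": ["penguins"]}).encode("utf-8")
create_request = Request(
    f"{BASE_URL}/notes",
    data=payload,
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {TOKEN}",
    },
    method="POST",
)

# Nothing has been sent; we only inspect what would go over the network.
print(search_request.get_method(), search_request.full_url)
print(create_request.get_method(), create_request.full_url)
print(create_request.data)
```

Passing either object to `urllib.request.urlopen` would actually send it; that step is left out here because the endpoint is fictional.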

dicts
  • A dict is a collection of key-value pairs.

  • Create a dict with the syntax {key1: value1, key2: value2, ...}.

  • Get and set elements of a dict with square brackets: my_dict[key1] = new_value1.
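
A short sketch of these operations follows. The header names and values are invented for the example; the point is only the dict syntax.

```python
# Create a dict with {key: value, ...} syntax.
response_headers = {"Content-Type": "application/json", "Content-Length": 42}

# Get an element with square brackets.
print(response_headers["Content-Type"])  # application/json

# Set an existing element, or add a new one, the same way.
response_headers["Content-Length"] = 128
response_headers["Server"] = "example"

# Asking for a missing key with square brackets raises KeyError;
# .get() returns None (or a default you supply) instead.
print(response_headers.get("ETag"))             # None
print(response_headers.get("ETag", "no etag"))  # no etag

# Loop over all key-value pairs.
for name, value in response_headers.items():
    print(name, "=", value)
```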

Requests
  • GET requests are used to read data from a particular resource.

  • POST requests are used to write data to a particular resource.

  • GET and POST methods may require some form of authentication (POST usually does).

  • The Python requests library offers various ways to deal with authentication.

  • curl can be used instead for shell-based workflows and debugging purposes.
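
To make the GET/POST distinction and the authentication options more concrete, here is a minimal sketch using the `requests` library. It prepares requests without sending them, so it can run without network access. The URL, the username and password, and the token are placeholders, and the `Bearer` header is an assumed token scheme; `send` is a hypothetical helper showing how a prepared request could be dispatched.

```python
import json

import requests

API_URL = "https://api.example.com/items"  # hypothetical endpoint

# A GET request that reads data, with username/password (HTTP Basic) authentication.
get_request = requests.Request(
    "GET",
    API_URL,
    params={"q": "cats"},       # encoded into the URL as ?q=cats
    auth=("alice", "s3cret"),   # placeholder credentials
).prepare()

# A POST request that writes data, with a token passed in a header.
post_request = requests.Request(
    "POST",
    API_URL,
    json={"name": "Tom"},       # serialised into a JSON body
    headers={"Authorization": "Bearer my-secret-token"},  # assumed scheme
).prepare()

print(get_request.method, get_request.url)
print(get_request.headers["Authorization"])   # starts with "Basic "
print(post_request.headers["Content-Type"])   # application/json
print(json.loads(post_request.body))


def send(prepared):
    """Send a prepared request and return the response (not called here)."""
    with requests.Session() as session:
        return session.send(prepared, timeout=10)
```

In everyday use the shorter forms `requests.get(url, params=..., auth=...)` and `requests.post(url, json=..., headers=...)` build and send the request in one step; the prepared form is used here only so the request can be inspected first.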

Elements of Web Scraping with BeautifulSoup
  • A BeautifulSoup object can be navigated in many ways:

  • Use find to look for the first element that matches the given criteria in a subtree.

  • Use find_all to obtain a list of elements that match the given criteria in a subtree.

  • Use find_parents to get the list of ancestors of the given element.
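
The three methods above can be tried on a small, made-up HTML snippet. This is only a sketch: the markup is invented for the example, and it assumes the `bs4` package is installed and uses Python's built-in `html.parser` backend.

```python
from bs4 import BeautifulSoup

# Invented HTML, standing in for a downloaded page.
html = """
<html><body>
  <div id="menu">
    <ul>
      <li><a href="/home">Home</a></li>
      <li><a href="/about">About</a></li>
    </ul>
  </div>
  <p>See the <a href="/contact">contact page</a>.</p>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# find: the first matching element in the whole document.
first_link = soup.find("a")
print(first_link["href"])  # /home

# find_all: every matching element, here restricted to the menu subtree.
menu = soup.find("div", id="menu")
menu_links = menu.find_all("a")
print([link.get_text() for link in menu_links])  # ['Home', 'About']

# find_parents: the ancestors of an element, nearest first.
ancestors = first_link.find_parents()
print([tag.name for tag in ancestors])

# find_parents also accepts criteria, like find and find_all.
enclosing_divs = first_link.find_parents("div")
print(enclosing_divs[0]["id"])  # menu
```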

Glossary

FIXME