Lectures

  1. Lecture 1: Introduction to Urban Data Science (Sep 2 and 4, 2020)
  2. Lecture 2: Spatial and Urban Data (Sep 9, 2020)
  3. Lecture 3: Data Grammar (Sep 11, 2020)
  4. Lecture 4: Data Engineering (Sep 18, 2020)
  5. Lecture 5: EDA and Visualisation (Sep 23, 2020)
  6. Lecture 6: Geo-Visualisation (Sep 25, 2020)
  7. Lecture 7: Networks and Spatial Weights (Sep 30, 2020)
  8. Lecture 8: Exploratory Spatial Data Analysis (Oct 2, 2020)
  9. Lecture 9: Machine Learning for Everyone (Oct 7, 2020)
  10. Lecture 10: Anatomy of a Learning Algorithm (Oct 9, 2020)
  11. Lecture 11: Clustering (Oct 14, 2020)
  12. Lecture 12: Dimensionality Reduction (Oct 16, 2020)
  13. Lecture 13: Spatial Density Estimation (Oct 21, 2020)
  14. Lecture 14: Responsible Data Science (Oct 23, 2020)

Note: The slides will be updated at the latest the night before each lecture.



Before starting this course, watch this video by Khalid Kadir, a reflection on poverty (an example of a social problem), expertise and equity. It illustrates how experts create boxes around their craft. As data scientists or future AI experts, it is our responsibility to step out of those boxes and engage with communities to strive for just outcomes.



Lecture 1 - Introduction to Urban Data Science

To do before class

As a way to whet your appetite about the content of the first class, I recommend you:

The contents of this lecture are loosely based on, and explored in further detail in, the following four references:

  • [Recommended] “Chapter 1: Introduction” (Schutt & O’Neil, 2013). Free sampler of the book containing the chapter available online (html, pdf).
  • Excellent overview of Data Science (Donoho, 2017).
  • A Geographic take on Data Science, proposing a new field (Singleton & Arribas-Bel, 2019).
  • A critical approach to Data Science for Cities

References

  1. Schutt, R., & O’Neil, C. (2013). Doing data science: Straight talk from the frontline. O’Reilly Media, Inc.
  2. Donoho, D. (2017). 50 years of data science. Journal of Computational and Graphical Statistics, 26(4), 745–766.
  3. Singleton, A., & Arribas-Bel, D. (2019). Geographic Data Science. Geographical Analysis.



Lecture 2 - Spatial and Urban Data

Slides

To do before class

  • Watch the TED talk by Carlo Ratti about MIT’s SENSEable City Lab projects: an excellent set of examples.
  • Read the New York Times piece on US buildings map
  • Explore the GHSL Dataset, by the European Commission
  • The part of the lecture on new sources of data relies on (Arribas-Bel, 2014) and (Lazer & Radford, 2017).
  • (Goodchild, 2007): a classic on the rise of volunteered geographic information.
  • (Kitchin, 2014): recent book on the data revolution from a Social Science/Human geography perspective.

References

  1. Arribas-Bel, D. (2014). Accidental, open and everywhere: Emerging data sources for the understanding of cities. Applied Geography, 49, 45–53.
  2. Lazer, D., & Radford, J. (2017). Data ex Machina: Introduction to Big Data. Annual Review of Sociology, (0).
  3. Goodchild, M. F. (2007). Citizens as sensors: the world of volunteered geography. GeoJournal, 69(4), 211–221.
  4. Kitchin, R. (2014). The data revolution: Big data, open data, data infrastructures and their consequences. Sage.



Lecture 3 - Data Grammar

Slides

To do before class

  • A cheatsheet (such a misnomer: nobody is cheating, and it is a helpful and beautiful resource) on Data Wrangling with Pandas that you may want to stick on your wall or set as your screensaver to save time finding useful, working code snippets.



Lecture 4 - Data Engineering

Slides

To do before class

The contents of this lecture are loosely based on, and explored in further detail in, the following two references:




Lecture 5 - EDA and Visualisation

This lecture is partly inspired by (Tufte, 1983).

Slides

To do before class

  • Berinato, S. Visualisations That Really Work, Harvard Business Review, Jun 2016
  • Wainer, H. How to Display Data Badly. The American Statistician 1984; 38: 137–147
  • Alberto Cairo’s weblog, The Functional Art, about information design and visualisation, is an excellent resource for improving your visualisations.
  • (Yau, 2011)’s book “Visualize this” is a good general introduction to visualisation.
  • Check out the From Data to Viz chart selector to choose the right chart.

References

  1. Tufte, E. R. (1983). The visual display of quantitative information. Graphics Press, Cheshire, CT.
  2. Yau, N. (2011). Visualize this: the FlowingData guide to design, visualization, and statistics. John Wiley & Sons.



Lecture 6 - Geo-Visualisation

This lecture is partly inspired by (Rey, 2015).

Slides

To do before class

  • Watch this lecture on “Statistical maps” by Luc Anselin (link to 25min video).
  • Read the Conversation piece on the Flint case, where the MAUP played a key role.
  • Spend the rest of the prep hour browsing through Nathan Yau’s excellent blog, FlowingData.

References

  1. Rey, S. (2015). Geovisualization. In GPH471: Geographic Information Analysis. Lecture slides from a course taught at Arizona State University.
  2. Brewer, C. (2015). Designing better Maps: A Guide for GIS users. ESRI Press.



Lecture 7 - Networks and Spatial Weights

Slides

To do before class

  • Read Eli Knaap’s blog on Measuring Urban Segregation with Spatial Computation
  • Watch this lecture on “Spatial Weights” by Luc Anselin (link to 34min video). Keep in mind the motivation, in this case, is focused on spatial regression.
  • Lecture on “Spatial lag” by Luc Anselin (link to video; you can ignore the last five minutes as they are a bit more advanced).
  • Check out Geoff Boeing’s computational notebook showcasing the use of OSMnx, a Python library for processing street networks as network objects, with a case of Urban Street Network Analysis.
  • For a more advanced and detailed treatment, (Anselin & Rey, 2014) is an excellent reference.

References

  1. Anselin, L., & Rey, S. J. (2014). Modern Spatial Econometrics in Practice: A Guide to GeoDa, GeoDaSpace and PySAL. Chicago, IL: GeoDa Press LLC.



Lecture 8 - Exploratory Spatial Data Analysis

Slides

To do before class

  • Watch this lecture on “Spatial Autocorrelation (Background)” by Luc Anselin. [Part I] [Part II]
  • (Anselin, 1996) reviews the use of the Moran plot as an ESDA tool (you may access it on Sci-Hub using the DOI https://doi.org/10.1111/j.1467-9787.1996.tb01101.x).
  • (Symanzik, 2014) introduces the main concepts behind ESDA.
  • (Haining, 2014) is an excellent historical perspective on the origins and motivations behind most of the global and local measures of spatial autocorrelation.

References

  1. Anselin, L. (1996). The Moran scatterplot as an ESDA tool to assess local instability in spatial association. Spatial Analytical Perspectives on GIS, 111, 111–125.
  2. Symanzik, J. (2014). Exploratory Spatial Data Analysis. In Handbook of Regional Science (pp. 1295–1310). Springer.
  3. Haining, R. (2014). Spatial Data and Statistical Methods: A Chronological Overview. In Handbook of Regional Science (pp. 1277–1294). Springer.



Lecture 9 - Machine Learning for Everyone

Kindly buy The Hundred-Page Machine Learning Book: several topics from this point onwards draw on its chapters, and it is generally a fantastic book to have. If you cannot or do not want to spend $20.00 on the e-copy, email me, and we will figure something out. The author has invested a lot in writing this book, and it is an excellent resource on Machine Learning, even beyond this class.

Slides

To do before class

A Visual Introduction to Machine Learning by r2d3. It is a beautiful two-part series.

The contents of this lecture are loosely based on, and explored in further detail in, the following two references:




Lecture 10 - Anatomy of a Learning Algorithm

Slides

To do before class




Lecture 11 - Clustering

Slides

To do before class

  • Talk on “Geodemographics and the Internal Structure of Cities” by Prof. Alex Singleton (link to 50min video).
  • Chapters 1 and 2 in (Webber & Burrows, 2018) provide a fascinating account of the origins of Geodemographic classifications.
  • Chapter 7 in (Brunsdon & Singleton, 2015): Geodemographic Analysis, by Alexandros Alexiou and Alex Singleton.
  • (Duque, Ramos, & Suriñach, 2007) is an excellent review of regionalisation algorithms and well worth reading.
  • (Oke et al., 2019) provides a comprehensive urban typology framework using hierarchical clustering methods and a diverse, abundant set of data.

References

  1. Webber, R., & Burrows, R. (2018). The Predictive Postcode: The Geodemographic Classification of British Society. SAGE.
  2. Brunsdon, C., & Singleton, A. (2015). Geocomputation: A Practical Primer. SAGE.
  3. Duque, J. C., Ramos, R., & Suriñach, J. (2007). Supervised regionalisation methods: A survey. International Regional Science Review, 30(3), 195–220.
  4. Oke, J. B., Aboutaleb, Y. M., Akkinepally, A., Azevedo, C. L., Han, Y., Zegras, P. C., … & Ben-Akiva, M. E. (2019). A novel global urban typology framework for sustainable mobility futures. Environmental Research Letters, 14(9), 095006.



Lecture 12 - Dimensionality Reduction

Slides

To do before class

  • Read through this excellent step-wise example of Principal Component Analysis using airport delay data
  • Read this excellent community-driven explanation of PCA on StackExchange.

The contents of this lecture are loosely based on, and explored in further detail in, the following reference:




Lecture 13 - Spatial Density Estimation

Slides

To do before class

  • Lecture on “Point Pattern Analysis Basics” by Luc Anselin (link to 45min video, and link to a more recent 6min intro).
  • This class was partially based on (Rey, 2015).
  • The slides for this lecture were also inspired by Part 6 in (Brunsdon & Comber, 2015).

References

  1. Rey, S. (2015). Point Pattern Basics. In GPH471: Geographic Information Analysis. Lecture slides from a course taught at Arizona State University.
  2. Brunsdon, C., & Comber, L. (2015). An Introduction to R for Spatial Analysis and Mapping. SAGE Publications Ltd.



Lecture 14 - Responsible Data Science

Slides

To do before class

References

  1. El-Geneidy, A., Levinson, D., Diab, E., Boisjoly, G., Verbich, D., & Loong, C. (2016). The cost of equity: Assessing transit accessibility and social disparity using total travel cost. Transportation Research Part A: Policy and Practice, 91, 302–316.