Programming Blog

Feature Engineering with Geospatial Data: Binned Statistics in GEE

In quantitative research, generating novel features from alternative datasets is often a primary source of identifying variation and predictive performance. Geospatial data, such as weather patterns and agricultural metrics, provides a rich source for these signals. A common feature engineering task is to aggregate one spatial dataset by the discrete bins of another—for example, calculating…

Learning Resources

General Learning Resources Statistical Learning and Computer Science Economics Topics Public Policy Career Topics Innovation

Retrieval Augmented Generation (RAG)

In short, RAG is a style of LLM usage where you give the LLM more information on top of your prompt. Between prompt engineering and RAG, you can dramatically increase the ability of the model to predict an accurate response. This can be in the form of internet searches the agent performs automatically (like Gemini…

Programming with LLMs (not just generating code with a chatbot)

This page is about using LLM APIs in a programming project, not to be confused with generating and editing code using a chatbot. Under construction: this is currently a place for me to dump links and quick thoughts. It might turn into a real post one day. Resources Python to use LLM libraries Python is…

Transformer Architecture

This is a page to store resources and thoughts about transformer architecture and its applications. Components of the Transformer Architecture Tokenization The tokenization step takes the raw input and partitions it into tokens, bite-sized chunks of the data. Tokens are the unit of analysis of the transformer model, and the universe of all possible tokens…

LLM Tips and Tools

Large language models (LLMs) use a neural network architecture to learn the relationships between words and predict the next word in a sentence (multimodal models generalize beyond words, but words and sentences are a fine mental model to have). Below are some resources and tips I am collecting to better use…

LLM Prompts

Some of my recent prompting strategies… General Tips Keep in mind: LLMs are charting a way through a latent topic space. Prompts are the starting, pre-defined path on a longer journey, and you are asking the model to auto-complete that journey. Adding specific features to prompts helps in two ways: it helps the model start…

Earth Engine Guide

I’m working on a guide to help economists get started using Google Earth Engine. The first draft of this guide is on this Google Doc, which seems fairly effective so far. Some of the contents:
- common confusion about the client-side and server-side of Earth Engine functions
- a typical workflow for developing Earth Engine code
- notes…

Some tools for Bayesian Posterior Estimation

Below are some tools I’m familiar with as of 2023-04-07. Stan seems the most promising. I plan to come back and update this post in the future once I have finished a project. Stan – a power tool for Bayesian probability models Assuming you already want to sample from some posterior distribution… what tools are…

