Learning Consumer Preferences

Giving Deep Attention to Consumer Preferences with Large Language Models

Joshua Foster & Fredrik Ødegaard @ Ivey Business School

This paper proposes new methods for

Hedonic Demand Estimation

using large language models.

Hedonic Demand Estimation

An approach to measuring how the characteristics of a good or service affect its price and the quantity demanded.

What features?

Functional form?

Heterogeneity?

Market dynamics?

Question.

What if a large language model (LLM) could infer these relationships directly by mapping

vehicle description $\rightarrow$ market outcomes

when constrained by a structural model?

What we do.

Collect data from 80,000+ vehicle auctions.
Generate 240,000+ synthetic descriptions.

Perform a two stage estimation:

Train an LLM to encode a vehicle's description and predict the observed market outcomes.
Use the vehicle's encoding to recover the assumed demand primitives that generated the observed market outcomes.

Apply the model to market counterfactuals.

Assumed data-generating process.

$V|d$ and $N|d$ are independent for any $d$.

Private valuations are drawn iid from $V|d$.

Assumed data-generating process.

$V|d$ and $N|d$ are independent for any $d$.

Private valuations are drawn iid from $V|d$.

Pricing Mechanism: Open Cry English Auction

Assumption: No participating bidder will allow another bidder to claim the auction object with a bid they are willing and able to beat.

\[ \pi_i= \begin{cases} v_i-b_i & \text{if }\; b_i>\max\left(b_{-i}\right) \\ 0 & \text{otherwise.} \end{cases} \] \[ b_i^*=v_i \;\;\;\;\;\;\;\;\;\;\text{ if } v_i<\max\left(v_{-i}\right) \]

Some bids reliably reveal valuations...if:

There is no snipe bidding.
There isn't excessive jump bidding.
Bidders are present at the end of the auction.

Auction views/"watchers" $\propto$ market size.

Stage 1: point predict these market outcomes from the vehicle's description.

Stage 2: append a neural network that semi-nonparametrically estimates $V|d$, $N|d$.

Description $d$
"This 1974 BMW 3.0CS was refurbished before being acquired by the current owner on BaT in November 2021, and work included rust remediation, a repaint, an interior refresh, and updates to the brake and suspension systems..."

Description $d$
"This 1974 BMW 3.0CS was refurbished before being acquired by the current owner on BaT in November 2021, and work included rust remediation, a repaint, an interior refresh, and updates to the brake and suspension systems..."

Description $d$	"Demand Vector" $m: d\rightarrow e$	Density Estimator $h: e\rightarrow f_V, g_N$
[713, 15524, 8588, 155, 4, 288, 6842, 21, 17880, 6555, 137, 145, 3566, 30, 5, 595, 1945, 15, 5597, 565, 11, 759, 8835, 6, 8, 173, 1165, 18309, 22104, 16546, 6, 10, 2851, 12042, 6, 41, 6291, 14240, 6, 8, 3496, 7, 5, 18507, 8, 5436, 1743, ...]

Synthetic data: ~240,000 auction evaluations.


							experts = [
							  "Independent Vehicle History Report Provider",
							  "Certified Automotive Mechanic",
							  "Car Enthusiast Club President",
							  "Vehicle Appraiser",
							]

							specialties = [
							  "providing detailed history reports of vehicles, revealing past ownership, accident history, service records, and other vital information that can impact the vehicle's value",
							  "performing thorough inspections of vehicles, identifying any mechanical issues, maintenance needs, or potential future problems.",
							  "offering extensive knowledge about specific makes or models, including common issues, maintenance tips, and overall satisfaction.",
							  "specializing in determining the market value of used vehicles based on factors like make, model, year, mileage, condition, and market trends.",
							]

							authentic_description = dataset[car]['description']

							for i in range(len(experts)): 
							  system_message = f"""
							  You are an {experts[i]}. You specialize in {specialties[i]}. When someone gives you a description of a vehicle or vehicles parts, you offer a detailed representation of your expert point of view.
							  """

							  user_message = f"""
							  Please tell me what you think of this vehicle. 
							  {authentic_description}
							  """

							  synthetic_description[i] = client.chat.completions.create(
							    model="gpt-3.5-turbo-1106",
							    messages=[
							      {"role": "system", "content": system_message},
							      {"role": "user", "content": user_message}
							    ]
							  )

Validation Set Prediction Error

Absolute