
How to Build an Amazon Price Tracker with Python


If you want to monitor Amazon prices without manually refreshing product pages every day, you need an automated solution. Manual price tracking can introduce errors and cause you to miss deals, since Amazon changes prices multiple times per day on many products.

This guide walks you through building a fully functional Amazon price tracker with Python. You'll learn how to scrape product data, store price history, set up price-drop alerts, and handle technical challenges such as rate limiting and IP bans.

What Is an Amazon Price Tracker?

An Amazon price tracker is a script or application that monitors product prices over time. It connects to Amazon product pages, extracts the current price, saves that data, and optionally notifies you when prices drop below a threshold you set.

Existing tools like Keepa or CamelCamelCamel do the job, but third-party trackers may limit automation and customization. Some tools provide raw data via paid APIs, but you may still prefer a custom tracker for full control over storage, alerts, and integrations.

With a custom Python tracker, you can monitor unlimited products, run it on your own schedule, integrate with your existing tools, and build features that those services don't offer.

Prerequisites

Before we dive into creating the tracker, make sure you have the following installed on your machine:

  • Python 3.8 or higher: Download it from python.org if you haven't already, then verify your installation by running python --version in your terminal.
  • pip (Python package manager): This comes bundled with Python installations, and you'll use it to install the required libraries.
  • A code editor: VS Code, PyCharm, or even a basic text editor will work.
  • Basic Python knowledge: You should understand variables, functions, loops, and how to run Python scripts.

Building an Amazon Price Tracker: Step-by-Step

Use this guide to build a simple Amazon price tracker in Python. You’ll start with a one-URL scraper, then add price history, email alerts, and scheduling. After that, you’ll harden it against blocks and generate a basic chart so you can see price movement over time.

Step 1: Set Up Your Project Environment

A clean project environment keeps your dependencies isolated and your scraper portable. You don't want system-wide packages interfering with your project, and you don't want your scraper breaking when you update something unrelated. Start by creating a dedicated folder for your price tracker.

mkdir amazon-price-tracker

cd amazon-price-tracker

Next, create a virtual environment inside that folder. This isolates each package you install from your system Python, preventing version conflicts and making the project easier to share or rebuild later.

# Create virtual environment

python -m venv venv

After creating it, activate the virtual environment. You should see (venv) appear at the beginning of your terminal prompt, confirming that every package you install from this point goes only into this environment.

# Activate on Windows
venv\Scripts\activate

# Activate on macOS/Linux
source venv/bin/activate
Virtual environment activated.webp

Now install the libraries your tracker needs.

pip install requests beautifulsoup4 pandas schedule matplotlib

Each library serves a specific purpose in your scraper:

  • requests sends HTTP requests to fetch web pages. When your scraper asks for an Amazon product page, requests handles the network call and returns the HTML content from Amazon's server.
  • BeautifulSoup parses HTML and extracts data from it. Raw HTML is just text, but BeautifulSoup turns that text into a searchable structure, letting you find specific elements like product titles, prices, and ratings without writing complex string manipulation code.
  • pandas handles data storage and manipulation. It organizes your scraped prices into rows and columns, making it simple to save data to CSV files, calculate statistics, or compare prices over time.
  • schedule automates running your tracker at set intervals. It runs continuously in your terminal session. To run it in the background, use a process manager (system service / Task Scheduler / cron / Docker).

Once done, create a file called tracker.py in your project folder. This will be your main script.

Step 2: Understand Amazon's Page Structure

Before writing any extraction code, you need to understand how Amazon structures its product pages. Your scraper finds data by looking for specific HTML elements, so you need to know exactly where Amazon puts the information you want.

Open any Amazon product page in your browser, then right-click on the price and select Inspect to open Developer Tools. This shows you the HTML underlying what you see on screen.

Inspecting price element in DevTools.webp

Product prices typically appear in elements with specific class names or IDs. Amazon uses classes like a-price-whole for the dollar amount and a-price-fraction for cents. The product title usually sits in a span element with the ID productTitle.
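As a rough illustration of how those two pieces combine into a full price, here's a tiny stand-in (a static snippet and a regex instead of live HTML; the actual scraper later in this guide uses BeautifulSoup selectors):

```python
import re

# Simplified stand-in for the markup you'll see in DevTools
html = (
    '<span class="a-price-whole">1,249<span class="a-price-decimal">.</span></span>'
    '<span class="a-price-fraction">99</span>'
)

# Combine the dollar part and the cents part into one number
whole = re.search(r'a-price-whole">([\d,]+)', html).group(1).replace(",", "")
fraction = re.search(r'a-price-fraction">(\d+)', html).group(1)
price = float(f"{whole}.{fraction}")
print(price)  # 1249.99
```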

Spend a few minutes inspecting different product pages before you start coding. Look at how prices display for items on sale versus regular prices, and check if the title structure stays consistent.

Note: Amazon’s HTML changes sometimes. A selector that works today may stop working later, and different categories can use different layouts.

This is why good error handling becomes essential when your tracker runs unattended. Your script needs to handle missing elements gracefully rather than crashing when Amazon moves something around.

If you want a quick walkthrough with examples, you can also reference our guide on scraping Amazon product information.

Step 3: Build the Basic Scraper

Note: The following steps explain each component of the tracker. If you want to skip ahead, the complete script is in Step 10. Otherwise, follow along to understand how each piece works.

With your environment ready and the page structure understood, you can write the core scraping function. This function takes a product URL, fetches the page, and extracts the price and title.

However, the first challenge is making your request look legitimate. Amazon might block requests that don't include realistic browser headers, and a raw request with no headers often triggers blocks or returns different content than what you see in a browser.

That’s why including a User-Agent header that identifies your request as a real browser may help. The Accept-Language header also makes your scraper look like normal browser traffic. In some setups, Amazon may return a currency different from what you see in your browser, so the example includes a simple USD preference cookie.

import requests
from bs4 import BeautifulSoup
from decimal import Decimal, InvalidOperation
import random
import re
import time

USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:121.0) Gecko/20100101 Firefox/121.0",
]

def get_random_headers():
    return {
        "User-Agent": random.choice(USER_AGENTS),
        "Accept-Language": "en-US,en;q=0.9",
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
        "Connection": "keep-alive",
        "Referer": "https://www.amazon.com/",
        "Upgrade-Insecure-Requests": "1",
    }

def parse_price(text):
    """
    Extract the first number-like token and convert it to Decimal.
    This avoids accidentally merging other numbers (shipping, fees, other offers).
    """
    if not text:
        return None
    matches = re.findall(r"\d[\d.,]*", text)
    if not matches:
        return None

    token = matches[0]

    last_dot = token.rfind(".")
    last_comma = token.rfind(",")

    if last_dot > last_comma:
        decimal_sep = "."
        thousands_sep = ","
    else:
        decimal_sep = ","
        thousands_sep = "."
    token = token.replace(thousands_sep, "")
    if decimal_sep == ",":
        token = token.replace(",", ".")

    try:
        return Decimal(token)
    except (InvalidOperation, ValueError):
        return None

If you want to match your own browser exactly, open DevTools in Chrome, go to the Network tab, click any request, and find the User-Agent under Request Headers. You can also read our BeautifulSoup web scraping guide for additional reference if you want a deeper refresher on parsing HTML and selecting elements with BeautifulSoup.

Finding browser's User-Agent.webp

Now write the main scraping function. It sends a GET request to Amazon, parses the HTML response, and extracts the title and price.

Instead of manually stitching together the “whole” and “fraction” spans (which can vary by layout and locale), this version prefers parsing a full formatted price string (often found in a-offscreen), then falls back to a few other common price locations.

def get_product_data(url):
    """
    Fetch an Amazon product page and extract title and price.
    Returns a dictionary with product data or None if the request fails.
    """
    time.sleep(random.uniform(2, 5))

    cookies = {
        "i18n-prefs": "USD",
        "lc-main": "en_US",
    }

    try:
        response = requests.get(
            url,
            headers=get_random_headers(),
            cookies=cookies,
            timeout=15,
            allow_redirects=True
        )
        response.raise_for_status()
    except requests.RequestException as e:
        print(f"Request failed: {e}")
        return None

    soup = BeautifulSoup(response.content, "html.parser")

    # Extract product title
    title_element = soup.find("span", {"id": "productTitle"})
    title = title_element.get_text(strip=True) if title_element else None

    # Extract price (try common containers first, then fall back)
    price = None

    selectors = [
        "#corePriceDisplay_desktop_feature_div span.a-offscreen",
        "#corePrice_feature_div span.a-offscreen",
        "#apex_desktop span.a-price span.a-offscreen",
        "#priceblock_ourprice",
        "#priceblock_dealprice",
        "#priceblock_saleprice",
        "span.a-offscreen",  # last fallback
    ]

    for sel in selectors:
        el = soup.select_one(sel)
        if not el:
            continue
        raw = el.get_text(" ", strip=True)
        price = parse_price(raw)
        if price is not None:
            break

    return {
        "title": title,
        "price": price,
        "url": url
    }   

Finally, test your scraper with a single product to confirm it works. For this example, we'll extract data from the GoPro camera listing. When you copy a product URL from Amazon, it usually looks like this:

https://www.amazon.com/GoPro-Hero-Ecommerce-Packaging-Waterproof/dp/B0FCTZ1GLY/ref=sr_1_3?crid=2U2GGU9SXZX7V&dib=eyJ2IjoiMSJ9.JTFYjgNcJAf7XIdIpKBf0YOwybKBzEXu-YShfncfvMZY5FoYuXMPd1QAJopsoUMVe-cpI09Mnkd2DiO6Aa4XNcUonARj15DnzOH7pDs5G26-zN_Ys-ZakWBv6xbNSlYk4LhV0iLj-KnvTS1yfCDtt9BmPdYnRQ6Zs5sVKut3bSE1ynUipclxYRwDQ2fJKRNQnwsUfnk4z0uqXLysb10ycnx-RK-lZXmBmkUlCDCeVEA.6tLC3i8RYKeQv7LOuB-eTKTJFR1x_ekG3XM5oYPLhEM&dib_tag=se&keywords=go+pro&qid=1770191034&sprefix=go+pr%2Caps%2C408&sr=8-3

Most of that URL is tracking parameters and search data. The only part you need is the 10-character code after /dp/. That's the ASIN (Amazon Standard Identification Number), a unique identifier for every product. In this case, the ASIN is B0FCTZ1GLY.
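If you track many products, you can automate that cleanup with a small helper (the function names here are just illustrative):

```python
import re

def extract_asin(url):
    """Pull the 10-character ASIN that follows /dp/ in a product URL."""
    match = re.search(r"/dp/([A-Z0-9]{10})", url)
    return match.group(1) if match else None

def clean_product_url(url, domain="www.amazon.com"):
    """Rebuild a long tracking URL into the short, stable /dp/ form."""
    asin = extract_asin(url)
    return f"https://{domain}/dp/{asin}" if asin else url

messy = "https://www.amazon.com/GoPro-Hero-Ecommerce-Packaging-Waterproof/dp/B0FCTZ1GLY/ref=sr_1_3?crid=2U2GGU9SXZX7V"
print(clean_product_url(messy))  # https://www.amazon.com/dp/B0FCTZ1GLY
```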

Once you have the ASIN, rebuild the URL into a clean format:

https://www.amazon.com/dp/B0FCTZ1GLY

This shorter URL is more stable and works the same way. Now run the script and check the output. If you see the product title and price printed correctly, your basic scraper works. If you get None values or errors, double-check your selectors against the actual page HTML.

if __name__ == "__main__":
    test_url = "https://www.amazon.com/dp/B0FCTZ1GLY"
    result = get_product_data(test_url)

    if result and result["price"] is not None:
        print("Title:", result["title"])
        print(f"Price: ${result['price']}")
    else:
        print("Failed to fetch product data")
Testing the basic scraper.webp

Step 4: Store Price History

Scraping prices once isn't useful. The value comes from tracking changes over time, so you need to store each price reading with a timestamp. This lets you see trends, identify patterns, and know when prices hit their lowest point.

pandas makes data storage straightforward, and you'll save prices to a CSV file, which is simple to work with and opens in any spreadsheet application.

import pandas as pd
from datetime import datetime
import os

def save_price(product_data, filename="price_history.csv"):
    """
    Save product data with timestamp to a CSV file.
    Creates the file if it doesn't exist, appends if it does.
    """
    if not product_data or product_data.get("price") is None:
        print("No valid data to save")
        return

    timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")

    new_entry = {
        "timestamp": timestamp,
        "title": product_data["title"],
        "price": str(product_data["price"]),
        "url": product_data["url"]
    }

    if os.path.exists(filename):
        df = pd.read_csv(filename)
        df = pd.concat([df, pd.DataFrame([new_entry])], ignore_index=True)
    else:
        df = pd.DataFrame([new_entry])

    df.to_csv(filename, index=False)
    print(f"Saved: ${new_entry['price']} at {timestamp}")

This function checks if your CSV file already exists. If it does, the function loads the existing data and appends the new price reading. If not, it creates a fresh DataFrame. Either way, it saves everything back to the CSV file.

Each row in your CSV will contain a timestamp, the product title, the current price, and the URL. Over time, this file becomes your price history database.

Price history CSV file.webp

For larger projects or when tracking hundreds of products, consider using a proper database instead of CSV. SQLite works well for single-user applications because it stores everything in a single file and requires no server setup. PostgreSQL or MongoDB handle concurrent access better if you're building something more robust.
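As a sketch of the SQLite option, here's a drop-in variant of the save function (the table name and schema are assumptions; it returns the running row count, which is handy for quick sanity checks):

```python
import sqlite3
from datetime import datetime

def save_price_sqlite(product_data, db_path="price_history.db"):
    """Append one price reading to a local SQLite database and return row count."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS prices (
               timestamp TEXT,
               title     TEXT,
               price     TEXT,
               url       TEXT
           )"""
    )
    conn.execute(
        "INSERT INTO prices VALUES (?, ?, ?, ?)",
        (
            datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
            product_data["title"],
            str(product_data["price"]),
            product_data["url"],
        ),
    )
    conn.commit()
    count = conn.execute("SELECT COUNT(*) FROM prices").fetchone()[0]
    conn.close()
    return count

# In-memory DB for the demo; use a file path in the real tracker
count = save_price_sqlite(
    {"title": "GoPro HERO", "price": "249.99", "url": "https://www.amazon.com/dp/B0FCTZ1GLY"},
    db_path=":memory:",
)
print(count)  # 1
```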

Step 5: Add Price Drop Alerts

The real value of a price tracker comes from alerts. You want to know immediately when a product drops below your target price, not discover it days later when the deal has expired.

Python's built-in smtplib library lets you send emails directly from your script, and you'll configure it to connect to Gmail's SMTP server and send yourself a notification when prices drop. This step uses Decimal for price math, so comparisons and savings stay consistent.

For testing, you can set a very high target price so the alert triggers on the first run. Once you confirm emails work, set the target back to your real threshold.

import os
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from decimal import Decimal

# Set this high to force an email during testing
TARGET_PRICE = Decimal("9999.99")

def send_email_alert(product_data, target_price):
    sender = os.environ.get("EMAIL_SENDER")
    password = os.environ.get("EMAIL_PASSWORD")
    recipient = os.environ.get("EMAIL_RECIPIENT")

    if not all([sender, password, recipient]):
        print("Email credentials not configured. Skipping alert.")
        return False

    current_price = product_data.get("price")
    if current_price is None:
        return False

    current_price = Decimal(str(current_price))
    target_price = Decimal(str(target_price))
    savings = target_price - current_price

    subject = f"Price Drop Alert: {product_data['title'][:50]}"

    body = f"""
Good news! A product you're tracking dropped below your target price.

Product: {product_data['title']}
Current Price: ${current_price}
Target Price: ${target_price}
You Save: ${savings}

Buy now: {product_data['url']}
""".strip()

    msg = MIMEMultipart()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.attach(MIMEText(body, "plain"))

    try:
        with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
            server.login(sender, password)
            server.sendmail(sender, recipient, msg.as_string())
        print(f"Alert sent to {recipient}")
        return True
    except Exception as e:
        print(f"Failed to send email: {e}")
        return False

A few notes on the email setup. Gmail requires you to use an App Password rather than your regular password if you have two-factor authentication enabled. Create one in your Google Account security settings under App passwords.

Store credentials in environment variables rather than hardcoding them in your script, since hardcoded passwords are a security risk, especially if you ever share your code.

Google Account App Password.webp

Now create a function that combines price checking with alerting. This function fetches the current price, saves it to your history file, and sends an alert if the price is at or below your target. The return value tells you whether an alert was triggered, which is useful for logging.

def check_price_and_alert(url, target_price):
    product_data = get_product_data(url)
    if not product_data:
        print(f"Could not fetch price for {url}")
        return False

    save_price(product_data)

    current_price = Decimal(str(product_data["price"]))
    target_price = Decimal(str(target_price))

    if current_price <= target_price:
        return send_email_alert(product_data, target_price)

    print(f"Current price ${current_price} is above target ${target_price}")
    return False

If you see “Could not fetch price,” Amazon may be returning a block page or a different layout, so inspect the HTML and adjust selectors if needed.

Before you run the script, set the environment variables in the same terminal session. If you don’t, os.environ.get(...) returns None and the alert code will skip sending email. If you open a new terminal window, you’ll need to set them again. On macOS/Linux you can set them with export as shown below.

Use a Gmail App Password for EMAIL_PASSWORD. If Google displays it with spaces, retype the spaces manually or remove spaces entirely so you don’t paste hidden characters. Here’s an example for demonstration purposes.

export EMAIL_SENDER="myname@gmail.com"
export EMAIL_PASSWORD="abcdefghijklmnop"
export EMAIL_RECIPIENT="myname@gmail.com"
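If you're on Windows, the PowerShell equivalent looks like this (same placeholder values; in cmd, use set NAME=value instead):

```shell
# Windows (PowerShell)
$env:EMAIL_SENDER    = "myname@gmail.com"
$env:EMAIL_PASSWORD  = "abcdefghijklmnop"
$env:EMAIL_RECIPIENT = "myname@gmail.com"
```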
Email alert terminal prompt.webp
Price drop email copy.webp

Step 6: Automate the Tracker

Running your script manually every few hours defeats the purpose of building a tracker. Use the schedule library to check prices automatically at regular intervals, and set it to run every six hours or so. Why not every minute?

Two reasons. First, most product prices don't change that frequently, so checking once or twice per day catches the meaningful changes without wasting resources. Second, frequent requests might increase your chances of getting blocked since Amazon monitors request patterns, and hammering their servers every minute looks suspicious.

If you’re testing alerts, you can temporarily set a high target price so an email triggers on the first run. After you confirm alerts work, set the target back to your real threshold.

import schedule
import time
import random
from datetime import datetime
from decimal import Decimal

# Products to track: list of dictionaries with url and target_price
PRODUCTS = [
    {
        # GoPro example (ASIN: B0FCTZ1GLY)
        "url": "https://www.amazon.com/dp/B0FCTZ1GLY",
        "target_price": Decimal("180.99")
    },
    {
        "url": "https://www.amazon.com/dp/B0CL5KNB9M",
        "target_price": Decimal("149.99")
    },
]

def run_tracker():
    """
    Check prices for all tracked products with random delays between requests.
    """
    print(f"\n{'='*50}")
    print(f"Price check started at {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
    print(f"{'='*50}")

    for product in PRODUCTS:
        try:
            check_price_and_alert(product["url"], product["target_price"])
        except Exception as e:
            print(f"Error checking {product['url']}: {e}")

        # Random delay between requests to appear more human-like
        time.sleep(random.uniform(3, 8))

    print("Price check complete.\n")

if __name__ == "__main__":
    # Run immediately on start
    run_tracker()

    # Schedule to run every 6 hours
    schedule.every(6).hours.do(run_tracker)

    print("Tracker is running. Press Ctrl+C to stop.")

    while True:
        schedule.run_pending()
        time.sleep(60)

The random delay between products serves the same purpose. If you check 10 products in rapid succession with exactly two seconds between each request, that pattern is obviously automated.

Adding randomness makes your traffic look more like a human clicking through products at their own pace. In practice, you'll find that spacing out requests and varying the timing significantly reduces the number of blocks you encounter.

Step 7: Handle Anti-Bot Measures

Amazon may block automated requests depending on request rate, IP reputation, and page layout. If your tracker starts returning missing titles/prices or unexpected HTML, slow down your request pace, rotate headers, and add retries.

Rate limiting is common. If you send too many requests in a short period, Amazon can throttle or temporarily block your IP. Spacing out requests with random delays helps, which you already added in Step 6, but you may need longer delays if you're tracking many products.

CAPTCHAs can appear when Amazon suspects automation. If your scraper hits one, it usually won’t be able to proceed until you solve it manually. Avoiding CAPTCHAs is mostly about request pacing, consistent headers, and IP reputation.

IP bans can happen after repeated suspicious activity. When that happens, requests may start returning block pages even for normal browsing from that IP, at least for a while.
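A cheap way to notice block pages early is a content heuristic. The marker strings below are assumptions based on typical Amazon interstitials, so adjust them to what your requests actually return:

```python
def looks_blocked(html):
    """Rough check: did we get a CAPTCHA/interstitial instead of a product page?"""
    lowered = html.lower()
    markers = ("captcha", "robot check", "automated access")
    # Either a known block-page phrase appears, or the product title element is missing
    return any(marker in lowered for marker in markers) or "productTitle" not in html

page_ok = '<span id="productTitle">GoPro HERO</span> ...'
page_blocked = "<title>Robot Check</title> Enter the characters you see below"

print(looks_blocked(page_ok))       # False
print(looks_blocked(page_blocked))  # True
```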

To reduce blocks, rotate User-Agent strings, add jitter between requests, and include retry logic with backoff. When you get throttled or served an interstitial (commonly 429 or 503), wait longer before retrying.

import random
import time
import requests

USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:121.0) Gecko/20100101 Firefox/121.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.2 Safari/605.1.15",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36 Edg/120.0.0.0",
]

def get_random_headers():
    return {
        "User-Agent": random.choice(USER_AGENTS),
        "Accept-Language": "en-US,en;q=0.9",
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
        "Connection": "keep-alive",
        "Referer": "https://www.amazon.com/",
        "Upgrade-Insecure-Requests": "1",
    }

def fetch_with_backoff(session, url, cookies=None, max_retries=3):
    """
    Make a request with jitter + backoff.
    Returns a response (status 200) or None.
    """
    cookies = cookies or {}

    for attempt in range(1, max_retries + 1):
        # Random delay before each request
        time.sleep(random.uniform(2, 7))

        try:
            response = session.get(
                url,
                headers=get_random_headers(),
                cookies=cookies,
                timeout=20,
                allow_redirects=True,
            )

            if response.status_code == 200:
                return response
            # 429 is the clearest "too many requests"
            # 503 can also mean interstitials, bot checks, or temporary issues
            if response.status_code in (429, 503):
                wait = 5 * attempt + random.uniform(0, 3)
                print(
                    f"Got {response.status_code} (temporary block or interstitial). "
                    f"Waiting {wait:.1f}s, retry {attempt}/{max_retries}"
                )
                time.sleep(wait)
                continue

            print(f"Got status code {response.status_code}. Skipping.")
            return None

        except requests.RequestException as e:
            wait = 5 * attempt + random.uniform(0, 3)
            print(
                f"Request error: {e}. "
                f"Waiting {wait:.1f}s, retry {attempt}/{max_retries}"
            )
            time.sleep(wait)

    print("All retries failed")
    return None

To use this in your tracker, replace the direct requests.get(...) call inside get_product_data() with:

session = requests.Session()
response = fetch_with_backoff(session, url, cookies=cookies, max_retries=3)
if not response:
    return None

Step 8: Use Proxies for Reliable Scraping

For serious price tracking, especially when monitoring many products, you may need proxies. Proxies route your requests through different IP addresses, which makes it harder for Amazon to link repeated requests to one source and block you.

Two common options are residential and datacenter proxies.

  • Residential proxies use real IP addresses assigned by Internet Service Providers (ISPs) to home users. These look like normal consumer traffic, so they tend to last longer on heavily protected sites. They usually cost more than datacenter proxies.
  • Datacenter proxies are faster and cheaper, but Amazon can recognize many datacenter IP ranges and block them more aggressively. They can still work for lighter tracking, but you may see more failures when you scale up.

Here’s an example that shows proxy integration in a way that stays consistent with the request retry logic and the price parsing you added earlier:

import requests
from bs4 import BeautifulSoup

def get_product_with_proxy(url, proxy_url, max_retries=3):
    """
    Fetch product data using a proxy.

    proxy_url format:
    "http://username:password@proxy-server:port"
    """
    session = requests.Session()

    # Apply proxy to the session
    session.proxies.update({
        "http": proxy_url,
        "https": proxy_url,
    })

    # Keep currency / locale consistent (optional, but useful)
    cookies = {
        "i18n-prefs": "USD",
        "lc-main": "en_US",
    }

    response = fetch_with_backoff(session, url, cookies=cookies, max_retries=max_retries)
    if not response:
        print("Proxy request failed after retries.")
        return None

    soup = BeautifulSoup(response.content, "html.parser")

    title_el = soup.find("span", {"id": "productTitle"})
    title = title_el.get_text(strip=True) if title_el else None

    # Reuse the selector-based price extraction from Step 3 (requires parse_price)
    price = None
    for sel in [
        "#corePriceDisplay_desktop_feature_div span.a-offscreen",
        "#corePrice_feature_div span.a-offscreen",
        "span.a-offscreen",  # last fallback
    ]:
        el = soup.select_one(sel)
        if el:
            price = parse_price(el.get_text(" ", strip=True))
            if price is not None:
                break

    return {"title": title, "price": price, "url": url}

When configuring proxy rotation, the goal is to spread requests across IPs so you don’t hammer the same one repeatedly. Try not to send too many requests from the same IP in a short period. If a proxy starts failing often, remove it from the rotation and move on to the next one.

class ProxyRotator:
    """
    Rotate through a list of proxy URLs to distribute requests.
    """
    def __init__(self, proxy_list):
        self.proxies = list(proxy_list)
        self.current_index = 0

    def get_next_proxy(self):
        if not self.proxies:
            return None

        proxy = self.proxies[self.current_index]
        self.current_index = (self.current_index + 1) % len(self.proxies)
        return proxy

    def remove_proxy(self, proxy):
        if proxy in self.proxies:
            self.proxies.remove(proxy)
            print(f"Removed blocked proxy. {len(self.proxies)} proxies remaining.")

            if not self.proxies:
                self.current_index = 0
                return
            if self.current_index >= len(self.proxies):
                self.current_index = 0

Our residential and ISP proxies are designed to reduce blocks and improve success rates for scraping tasks like price tracking, though your results also depend on how your scraper behaves. We support HTTP/1 and HTTP/2 across Residential, ISP, and Datacenter proxies (HTTP/3 on Residential only), and we provide stable connections for automated data collection workflows.

Ping Proxies residential proxy.webp

Step 9: Visualize Price History

Once you've collected price data over time, visualization helps you spot trends and patterns. A chart shows you at a glance whether prices are trending up, down, or holding steady, and it makes the data actionable. matplotlib creates simple charts from your price history CSV.

To generate those charts, create a new file named visualize.py in your project folder and feel free to use this example script.

import os
import matplotlib.pyplot as plt
import pandas as pd

def plot_price_history(csv_file="price_history.csv", product_url=None):
    """
    Generate a price history chart from saved data.
    Optionally filter by product URL.
    """
    if not os.path.exists(csv_file):
        print(f"No data file found: {csv_file}")
        return

    df = pd.read_csv(csv_file)
    df["timestamp"] = pd.to_datetime(df["timestamp"])

    # Convert stored string prices back to numeric
    df["price"] = pd.to_numeric(df["price"], errors="coerce")

    # Filter by product if URL provided
    if product_url:
        df = df[df["url"] == product_url]

    df = df.dropna(subset=["price"])

    if df.empty:
        print("No data to plot")
        return

    plt.figure(figsize=(12, 6))

    # Plot each product separately
    for url in df["url"].unique():
        product_df = df[df["url"] == url]
        title = product_df["title"].iloc[0] if "title" in product_df.columns else url
        label = title[:40] + "..." if len(title) > 40 else title
        plt.plot(product_df["timestamp"], product_df["price"], marker="o", label=label)

    plt.xlabel("Date")
    plt.ylabel("Price (USD)")
    plt.title("Amazon Price History")
    plt.legend(loc="upper right", fontsize=8)
    plt.xticks(rotation=45)
    plt.grid(True, alpha=0.3)
    plt.tight_layout()

    plt.savefig("price_history.png", dpi=150)
    plt.show()
    print("Chart saved to price_history.png")

def analyze_price_data(csv_file="price_history.csv", product_url=None):
    """
    Calculate price statistics for tracked products.
    Returns a dictionary with price analysis.
    """
    if not os.path.exists(csv_file):
        return None

    df = pd.read_csv(csv_file)

    if product_url:
        df = df[df["url"] == product_url]

    df["price"] = pd.to_numeric(df["price"], errors="coerce")
    prices = df["price"].dropna()

    if prices.empty:
        return None

    return {
        "current_price": float(prices.iloc[-1]),
        "lowest_price": float(prices.min()),
        "highest_price": float(prices.max()),
        "average_price": round(float(prices.mean()), 2),
        "price_range": round(float(prices.max() - prices.min()), 2),
        "data_points": int(len(prices)),
        "is_lowest": bool(prices.iloc[-1] == prices.min())
    }

Run this function periodically to generate updated charts. You can also calculate statistics to supplement the visual data. These statistics tell you whether the current price is a good deal relative to historical data. For example, if the current price equals the lowest recorded price, that's a strong buy signal.

Price history visualization.webp

Step 10: Put It All Together

Here's the complete tracker script combining everything covered above. Save this as tracker.py and customize the product list and email settings for your needs. The script includes all imports, configuration options, helper functions, main scraping logic, email alerts, scheduling, and CSV export.

Want to try it yourself? Here's the complete Amazon price tracker script.

Run the script with python tracker.py. It will keep running as long as the terminal session stays open. To run it in the background, use Task Scheduler, cron, systemd, or Docker.

Final script terminal output.webp
Email alert sent successfully.webp

Troubleshooting Common Issues

Amazon scraping fails in a few common ways, and the error message usually tells you what happened. Check the response you got back first (status code and a short HTML snippet), then adjust your request pattern or parsing based on what you see.

  • "Request failed" or timeout errors: Amazon might be blocking your IP. Wait a few hours before trying again, and if the problem persists, try a different network or use a proxy.
  • Price returns None: Amazon's page structure varies by product category and region. Inspect the actual HTML returned by your request and update your selectors accordingly. Some products display prices differently when logged in versus logged out.
  • 403 Forbidden errors: Your User-Agent or headers look suspicious. Update your User-Agent strings to match current browser versions and make sure you're including all necessary headers.
  • CAPTCHAs appearing: You're making requests too quickly or too predictably. Increase delays between requests, add more randomization, and consider using residential proxies if the problem continues.
  • Script stops after a while: Add comprehensive error handling so one failed request doesn't crash your entire tracker. Wrap your main loop in try-except blocks and log errors instead of letting them propagate.
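For that last point, a minimal pattern using Python's standard logging module might look like this (a sketch; adapt the names to your script):

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")

def run_safely(check_fn, *args):
    """Run a single price check; log the traceback instead of crashing the loop."""
    try:
        return check_fn(*args)
    except Exception:
        logging.exception("Price check failed; continuing with the next product")
        return None

# Usage: one bad product no longer stops the whole run
results = [run_safely(fn) for fn in (lambda: 42, lambda: 1 / 0)]
print(results)  # [42, None]
```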

Wrapping Up

You now have the building blocks for a functional Amazon price tracker. The basic version scrapes prices and stores history. Add email alerts to get notified of price drops, and use proxies to scale up and avoid blocks.

A few tips to keep your tracker running smoothly. Start small with just a few products before scaling up, and monitor your success rate to adjust request timing if blocks increase. Keep your User-Agent strings updated as browsers release new versions, and back up your price history data regularly. For production use with many products, residential proxies become essential for maintaining consistent success rates.

Always review and follow the terms of service and legal requirements that apply to your use case. Scrape responsibly, respect rate limits, and avoid overwhelming servers with excessive requests.

