There’s a saying in the data science world: “garbage in, garbage out.”


In other words, the data you feed any algorithm determines the quality of the algorithm’s output. And while this is true for all data science, it’s especially pertinent for dynamic pricing algorithms.


Dynamic pricing tools are like any other algorithm: they need great data as input to give you a great pricing output. What you put into a dynamic pricing solution has a colossal impact on the price advice it creates. If your competitor data is out of date, for example, the tool will generate equally bad price advice.


But what data do you need, and how do you ensure it’s at a high enough quality? Here are four key things you need to know about data quality in pricing. 


1. You need both internal and external data

How much data does a dynamic pricing tool need? 


The answer is: a lot. 


The first (and most obvious) type is competitor pricing data: the prices your competitors advertise for their products on different online shopping channels. We’ll cover this more in the next section, but it’s important to have this data so you can keep your prices aligned with the overall market value.


But just getting competitor pricing data isn’t enough to have a profitable dynamic pricing strategy. You also need to incorporate internal information like your purchase price and stock levels for every product. Without this internal data, you risk advertising a price below your purchase price, for example, and can lose out on margin as a result. 
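To make the purchase-price risk concrete, here is a minimal sketch of a price floor. The function name `apply_price_floor` and the 5% minimum margin are assumptions for illustration, not how any particular pricing tool works.

```python
from decimal import Decimal, ROUND_CEILING


def apply_price_floor(suggested_price, purchase_price, min_margin=Decimal("0.05")):
    """Raise a suggested price to the floor if it would fall below it.

    The floor is purchase_price * (1 + min_margin), rounded up to whole cents.
    Both the function name and the default margin are illustrative assumptions.
    """
    floor = (purchase_price * (Decimal("1") + min_margin)).quantize(
        Decimal("0.01"), rounding=ROUND_CEILING
    )
    return max(suggested_price, floor)


# A competitor-driven suggestion of 9.50 for a product bought at 10.00
# is lifted to the floor instead of being advertised at a loss.
print(apply_price_floor(Decimal("9.50"), Decimal("10.00")))
```

A check like this only works if the purchase price it reads is current, which is why the internal data needs the same care as the competitor data.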


This internal data shifts frequently for every product in your assortment, so you can’t plug this data in once and forget about it. If you do, the dynamic pricing tool will continue to make decisions based on flawed data, like old purchase prices or incorrect stock levels. While having some data is better than having no data, improperly managed data creates a risk of suboptimal prices.
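One simple way to catch data that was plugged in and forgotten is a freshness check. The sketch below assumes each internal record carries a `sku` and a timezone-aware `updated_at` timestamp. Both field names and the 24-hour threshold are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def find_stale_records(records, max_age=timedelta(hours=24), now=None):
    """Return the SKUs whose internal data is older than max_age.

    Each record is assumed to be a dict with 'sku' and 'updated_at' keys.
    """
    now = now or datetime.now(timezone.utc)
    return [r["sku"] for r in records if now - r["updated_at"] > max_age]


now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
records = [
    {"sku": "A-100", "updated_at": now - timedelta(hours=2)},
    {"sku": "B-200", "updated_at": now - timedelta(days=3)},
]
print(find_stale_records(records, now=now))
```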


2. Competitor pricing data comes from two sources

Competitor pricing data comes from two places in two different formats. 


First, the data comes either from comparison shopping engines or directly from your competition’s website. Each of these sources has its pros and cons.  


Comparison shopping engines are a great place to start because you can estimate the market value of every product. As a marketplace, CSEs give you perspective about how your competitors interact with other market players. With CSE data, you can deduce your competitor’s strategies. You might notice, for example, that Competitor X always prices 10% lower than Competitor Y in electronics products. 
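As a rough illustration of that kind of deduction, the sketch below compares two competitors on the products they both list and takes the median price ratio. The function name, the input shape (a dict of product ID to price), and the sample prices are all made up for the example.

```python
from statistics import median


def median_price_ratio(prices_a, prices_b):
    """Median of price_a / price_b over products both competitors list.

    Returns None when the two competitors share no comparable products.
    """
    shared = prices_a.keys() & prices_b.keys()
    ratios = [prices_a[p] / prices_b[p] for p in shared if prices_b[p] > 0]
    return median(ratios) if ratios else None


# Hypothetical CSE data for two competitors in electronics
competitor_x = {"tv": 450.0, "phone": 270.0, "laptop": 900.0}
competitor_y = {"tv": 500.0, "phone": 300.0, "laptop": 1000.0, "camera": 200.0}

ratio = median_price_ratio(competitor_x, competitor_y)
print(f"Competitor X prices at {ratio:.0%} of Competitor Y")
```

A ratio that stays close to 0.9 across many products and many days would support the "always 10% lower" reading. A single snapshot would not.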


CSEs give you perspective and show you accurate prices for a variety of competitors in one go, but your competitors won’t advertise every single product in their assortment on a comparison shopping engine. If you want to make sure you match on every product, and get data like stock levels, you need to go a step further and scrape directly from competitor websites.


Second, that data can be matched to your products in one of two formats: a product URL or a Global Trade Item Number, or GTIN for short.


Most often, dynamic pricing tools work with product URLs to match products. Your team will need to keep an accurate database of URLs for every product across every competitor website or CSE, and will need to check the links repeatedly to make sure that the URLs are functional and accurate. If a link breaks and your team doesn’t pick up on it immediately, the dynamic pricing engine won’t find or match that product.


Most teams don’t have the manpower to keep up with the work required for URL matching. It’s stressful to manage because teams need to devote their limited time and energy to maintaining the URL for every product in their assortment. And when you have hundreds of thousands of products and only 8 hours in a day, resources get directed (understandably) to the high-runner products that sell frequently and are highly elastic. 


But in this scenario your long-tail products get lost and left behind. URLs break, and nobody notices. Your dynamic pricing tool isn’t able to find products and update prices. Your company loses money. 


That’s where a product’s GTIN comes into play. With GTINs, you just provide the software with the unique code (8, 12, 13, or 14 digits, depending on the format) for every product in your assortment. The software can then scan the market for those codes and match prices on that basis.
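Because matching depends entirely on the codes being right, it helps to validate them before they go into the tool. Every GTIN ends in a check digit calculated with the standard GS1 method. The sketch below implements that check, and the function name `is_valid_gtin` is just a label chosen for this example.

```python
def is_valid_gtin(code):
    """Check the length and GS1 check digit of a GTIN-8, -12, -13, or -14."""
    if not (code.isascii() and code.isdigit()) or len(code) not in (8, 12, 13, 14):
        return False
    digits = [int(c) for c in code]
    body, check_digit = digits[:-1], digits[-1]
    # Starting from the digit next to the check digit and moving left,
    # the weights alternate 3, 1, 3, 1, ...
    total = sum(d * (3 if i % 2 == 0 else 1) for i, d in enumerate(reversed(body)))
    return (10 - total % 10) % 10 == check_digit


print(is_valid_gtin("4006381333931"))  # a well-formed 13-digit code
print(is_valid_gtin("4006381333932"))  # same code with a wrong last digit
```

A passing check digit only shows the code is well formed. It does not prove the code belongs to the product you think it does.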


In our experience, the best way to balance hundreds of competitors and thousands of products is to use a mix of URLs and GTINs. In this blend, you use the main URL for your competitor’s website, then use GTIN codes to search the website for your products. This means you only have one URL to track per competitor, and it’s a URL that is unlikely to break or change.
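The blend can be pictured as one base URL per competitor combined with a GTIN lookup. The sketch below is only an illustration: the `/search` path and the `q` parameter are assumptions, since every site exposes search differently, and the domains are placeholders.

```python
from urllib.parse import urlencode, urljoin


def build_search_urls(base_urls, gtins, search_path="/search", query_param="q"):
    """Build one search URL per (competitor, GTIN) pair.

    search_path and query_param are hypothetical defaults. A real site
    may need a different path or parameter name.
    """
    return {
        (competitor, gtin): urljoin(base, search_path) + "?" + urlencode({query_param: gtin})
        for competitor, base in base_urls.items()
        for gtin in gtins
    }


base_urls = {"competitor-x": "https://competitor-x.example"}
for key, url in build_search_urls(base_urls, ["4006381333931"]).items():
    print(key, url)
```

Only the base URLs need maintaining here. The per-product part is derived from the GTIN list you already have.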


This makes the data collection process somewhat more expensive, but it also means your data is consistently high quality and accurate. And the monetary investment in a proper data collection solution up front is typically less than the costs incurred from unmatched products, frustrated teams, and retroactive data validation. 


3. It’s hard to get competitor data

To get competitor or market data, your tool needs to go through a “scraping” process. A tool called a spider will “crawl” the internet and find the information you’re requesting.


Your competitors know that you use a spidering tool to get information from their website. And they’re starting to make it more difficult for crawlers to extract that information. 


How, you ask? One example is by blocking IP addresses entirely. If your crawler uses an IP address to view a website, you leave a trace of your presence with your competition. If your competitor’s website notices the same IP address returning too frequently, it will block that address.


To overcome this, crawlers often use multiple IP addresses to reduce the dependency on one single way of gathering the data. But this isn’t the only way competitors will try to prevent you from gathering pricing data. Crawlers are built to gather information from a website’s design. If that design changes significantly, it will confuse the spidering tool.


For a properly functioning spidering tool, you need a team of people monitoring the e-commerce landscape and updating the tool when these kinds of defenses are put into place. 
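A design change usually shows up as an extraction rule that suddenly finds nothing. The sketch below, built on Python’s standard `html.parser`, looks for a price inside an element with a given CSS class and returns `None` when that element is gone, which is the signal a monitoring team would act on. The class name `product-price` and the sample pages are invented for the example.

```python
from html.parser import HTMLParser


class PriceExtractor(HTMLParser):
    """Capture the first text found inside an element with the given class."""

    def __init__(self, price_class):
        super().__init__()
        self.price_class = price_class
        self._capturing = False
        self.price_text = None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.price_text is None and self.price_class in classes:
            self._capturing = True

    def handle_data(self, data):
        if self._capturing and data.strip():
            self.price_text = data.strip()
            self._capturing = False


def extract_price(html, price_class="product-price"):
    """Return the price text, or None if the expected element is missing."""
    parser = PriceExtractor(price_class)
    parser.feed(html)
    parser.close()
    return parser.price_text


old_page = '<div><span class="product-price">19.99</span></div>'
new_page = '<div><span class="amount">19.99</span></div>'  # after a redesign

for page in (old_page, new_page):
    price = extract_price(page)
    print(price if price is not None else "price element missing: review the spider")
```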


4. There are a lot of vendors selling bad data

Data collection is an insanely popular and high-demand industry at the moment. Every retailer and brand wants to understand the internet marketplace, and is willing to pay something for that information.


Entrepreneurs know that. And they want to capitalize on it. 


So plenty of vendors offer data at a price that looks too good to be true. As with many things, it usually is. To offer data at an astonishingly low price, vendors skip some vital safety checks that keep your data clean, organized, complete, and up-to-date. Some cheap data sources might cut corners on things like:


  • Automatic updates several times per day
  • Scraping from both comparison shopping engines and competitor websites
  • The use of GTINs in addition to URLs to reduce manual labor
  • Proper tooling designed with your competitor’s defenses in mind
  • System updates and maintenance
  • Consistent development time to improve data collection
  • Extra quality assurance checks for both internal and external data


Without all the above in place, the price advice the dynamic pricing tool creates won’t be as powerful (or accurate) as it could (and should) be. And without consistent development to improve the data collection, a dynamic pricing tool will quickly become obsolete.


Low-quality data is also easy to spot. For many of our customers who come to us with pre-existing data sources, the super users of dynamic pricing tools already knew the data was flimsy. They didn’t trust the price outputs that the system created, and the whole dynamic pricing tool was a waste of an investment up to that point. 


Here’s the thing: proper data collection is, by itself, somewhat expensive. But that’s because there is a ton of work that goes into making sure the data is reliable and usable. Quality assurance checks. Regular testing. Rigorous evaluations of suppliers. And more. 
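To give a feel for what one of those quality assurance checks might look like, here is a minimal sketch that flags suspicious competitor prices for a single product. The 50% tolerance and the function name are assumptions for illustration. A real validation process would involve many more checks than this.

```python
from statistics import median


def flag_suspicious_prices(offers, tolerance=0.5):
    """Flag missing, non-positive, or outlying prices for one product.

    offers maps a competitor name to a price (or None if nothing was scraped).
    A price is an outlier if it deviates from the median of the usable
    prices by more than `tolerance`, expressed as a fraction of that median.
    """
    valid = {c: p for c, p in offers.items() if p is not None and p > 0}
    flagged = {c: "missing or non-positive" for c in offers if c not in valid}
    if valid:
        mid = median(valid.values())
        for competitor, price in valid.items():
            if abs(price - mid) / mid > tolerance:
                flagged[competitor] = "far from median"
    return flagged


# Hypothetical scraped offers for one product
offers = {"a": 100.0, "b": 105.0, "c": 9.99, "d": None, "e": 98.0}
print(flag_suspicious_prices(offers))
```

Flagged prices are candidates for review, not automatic deletions: an unusually low price can be a scraping error, but it can also be a genuine promotion.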


When you pay more for data and use a quality validation process, you can trust the input that goes into the dynamic pricing tool...and therefore trust the output as well. Your team can relax knowing that the price advice the tool creates is based on accurate market data and understandable business rules.


Final thoughts

If there’s one thing to take away from this blog, it’s this: you can get the data you need for cheap, but there is zero guarantee on the quality of that data.


Quality data collection takes time, energy, and investment, but the peace of mind it brings (and the price optimization capacity) is well worth the cost.


Is validating all this data worth your time? Absolutely, because without it your dynamic pricing system will be more of a hindrance than a tool. But is it worth investing time and energy (and money) to develop the tools to validate this data in-house? Well...that’s up to you. 


As a retailer or brand, you want to sell your products. That’s what you’re good at, and it’s what you enjoy doing. The purpose of dynamic pricing is to help you achieve that goal by positioning yourself correctly in the market. 


But is dynamic pricing your only responsibility? No. You’re also in charge of procurement, purchasing, marketing, strategy, innovation...the list is endless.


To be honest, your time is better spent focusing on your company’s goals, not worrying about the small (but extremely important) details that could make or break dynamic pricing. It’s much easier (and more profitable) for you to outsource that task to an entity that focuses specifically on dynamic pricing and can do all the quality assurance for you.


If you’re curious how Omnia can help you do that, reach out for a chat. We’re happy to discuss data with you at any point. 


PS - Already using a data provider and don't want to double up on costs?

No problem. At Omnia you can connect your existing data provider to our system, have the data checked, and enrich it with data from our trusted partners. 


Interested? Reach out today to ask our team how it works (and try it free for two weeks).