A project is a vehicle through which customer requirements are translated into workable data and value is delivered. Project requirements include:
- Data requirements such as URLs to source data from, data points to extract, data structure & format, and additional instructions such as specific categories of data to source.
- Delivery requirements such as frequency at which data should be pulled and delivery destinations.
Projects requiring data to be pulled only once are referred to as one-time projects. Time-sensitive data that is prone to frequent changes can be pulled as often as weekly, daily, or even hourly. These projects are referred to as recurring projects.
Once the requirements are vetted and clear, they are either executed together in a single report or divided into granular sets of data and delivery requirements across multiple reports. How data should be organized is primarily driven by the customer’s requirements, with Grepsr providing consultation if needed.
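To make the relationship between projects, reports, and requirements concrete, here is a minimal sketch in Python. It is only an illustration of the structure described above, not Grepsr's actual data model or API: the class names, field names, and the example URL are all assumptions, as is the rule that a project counts as recurring when any of its reports is recurring.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Frequency(Enum):
    """How often data is pulled (hypothetical values based on the text above)."""
    ONE_TIME = "one-time"
    WEEKLY = "weekly"
    DAILY = "daily"
    HOURLY = "hourly"


@dataclass
class Report:
    """One granular set of data and delivery requirements."""
    name: str
    # Data requirements
    urls: List[str]             # URLs to source data from
    data_points: List[str]      # data points to extract
    output_format: str          # data structure and format, e.g. "csv"
    # Delivery requirements
    frequency: Frequency        # how often data should be pulled
    destinations: List[str]     # where the data should be delivered
    instructions: str = ""      # additional instructions, e.g. categories to source

    @property
    def is_recurring(self) -> bool:
        return self.frequency is not Frequency.ONE_TIME


@dataclass
class Project:
    """A project groups one or more reports."""
    name: str
    reports: List[Report] = field(default_factory=list)

    @property
    def is_recurring(self) -> bool:
        # Assumption: a project is recurring if any of its reports is recurring.
        return any(report.is_recurring for report in self.reports)


# Example: requirements divided into two reports within a single project.
project = Project(
    name="Competitor pricing",
    reports=[
        Report(
            name="Catalog snapshot",
            urls=["https://example.com/catalog"],
            data_points=["title", "category"],
            output_format="csv",
            frequency=Frequency.ONE_TIME,
            destinations=["email"],
        ),
        Report(
            name="Daily prices",
            urls=["https://example.com/catalog"],
            data_points=["title", "price"],
            output_format="csv",
            frequency=Frequency.DAILY,
            destinations=["email"],
            instructions="Electronics category only",
        ),
    ],
)
```

Keeping the delivery requirements on each report rather than on the project mirrors the idea that requirements can be divided into granular sets, each with its own schedule and destination.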
Q: Is web scraping legal?
A: Scraping publicly available data is perfectly legal so long as 1) it does not violate the source site’s terms of service, 2) the data is not copyrighted, and 3) the data does not contain Personally Identifiable Information (PII). That said, this is a contested and often misunderstood topic. You can read more about the legalities of web scraping on our blog.
Q: Can any website or data be scraped?
A: Our solution expertise and advanced data infrastructure allow us to pull data from thousands of sources and work around complex security controls. However, compliance with our data policy is a must before a project is taken forward.
The Projects menu in the left navigation opens a page that lists all the projects in the customer’s account, with the most recent projects shown first. The search bar at the top left of the grid is handy in large accounts with many projects. It searches across multiple fields and shows close matches as well as exact matches.
Fig: List of projects in an account
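For readers curious how a search that covers several fields and tolerates close matches can behave, the sketch below shows one simple approach using Python's standard `difflib` module. It is not how the Grepsr search bar is implemented: the field names (`name`, `description`), the sample projects, and the 0.6 similarity cutoff are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Sequence


def field_score(query: str, value: str) -> float:
    """Score one field: 1.0 for a substring hit, otherwise a similarity ratio."""
    q, v = query.lower(), value.lower()
    if q in v:
        return 1.0
    return SequenceMatcher(None, q, v).ratio()


def search_projects(
    query: str,
    projects: List[Dict[str, str]],
    fields: Sequence[str] = ("name", "description"),  # hypothetical field names
    cutoff: float = 0.6,                              # assumed similarity threshold
) -> List[Dict[str, str]]:
    """Return projects whose best-matching field scores at or above the cutoff."""
    scored = []
    for project in projects:
        best = max(field_score(query, str(project.get(f, ""))) for f in fields)
        if best >= cutoff:
            scored.append((best, project))
    # Best matches first; the sort is stable, so ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [project for _, project in scored]


# Hypothetical sample data.
sample_projects = [
    {"name": "Hotel pricing", "description": "daily rates"},
    {"name": "Job listings", "description": "weekly postings"},
]

# A misspelled query can still find the intended project.
close_matches = search_projects("hotl pricing", sample_projects)
```

Scoring each field separately and keeping the best score means a query only needs to resemble one field of a project for that project to appear in the results.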
From the project list, the project name links to the project workspace. For a fully executed project, the workspace contains one or more reports with data, a messaging section for project conversations, a quality dashboard to keep an eye on data quality, and report-specific configurations for scheduling and data delivery. For newer projects that are still being vetted, the workspace is primarily limited to the messaging section, which also serves as the landing page for the project.
Topics in this section: