swdata package

Submodules

swdata.adapters module

swdata.adapters.request_sw_data(url)[source]

Adapter function for fetching data from a SWAPI URL.

swdata.adapters.results_generator(url: str) → Generator[Dict, None, None][source]

Result generator that automatically requests the next page of results until there are no more results.
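
The pagination behaviour described above can be sketched as follows. This is a minimal illustration, not the package's implementation: it assumes each response is a JSON object with a "results" list and a "next" URL (the shape SWAPI list endpoints return), and the names paginate, fetch and FAKE_PAGES are hypothetical. The real function takes only a url; the fetch argument here stands in for the HTTP call so the example runs offline.

```python
from typing import Callable, Dict, Generator, Optional


def paginate(url: str, fetch: Callable[[str], Dict]) -> Generator[Dict, None, None]:
    """Yield every item from a paginated API by following "next" links.

    `fetch` is a hypothetical stand-in for the HTTP adapter.
    """
    next_url: Optional[str] = url
    while next_url:
        page = fetch(next_url)
        # Hand out the items of the current page one by one.
        yield from page.get("results", [])
        # Assumed response shape: "next" is None on the last page.
        next_url = page.get("next")


# Offline stand-in for two pages of an API response.
FAKE_PAGES = {
    "page1": {"results": [{"name": "Luke Skywalker"}, {"name": "C-3PO"}], "next": "page2"},
    "page2": {"results": [{"name": "R2-D2"}], "next": None},
}

names = [item["name"] for item in paginate("page1", FAKE_PAGES.__getitem__)]
print(names)
```

Because the function is a generator, a page is requested only when the consumer has exhausted the previous one.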

swdata.people module

These are people provider and transformation classes.

class swdata.people.PeopleProvider(url: str = 'https://swapi.dev/api/people/')[source]

Bases: object

Simple api adapter for fetching people data from SWAPI.

get_people() → Generator[Dict, None, None][source]

url: str = 'https://swapi.dev/api/people/'

class swdata.people.PeopleSaver(transformer: swdata.people.PeopleTransformator = <factory>, provider: swdata.people.PeopleProvider = <factory>)[source]

Bases: object

An aggregate for handling src people data and saving reports.

provider: swdata.people.PeopleProvider

save(directory: str) → swdata.reports.Report[source]

transformer: swdata.people.PeopleTransformator

class swdata.people.PeopleTransformator(planets: swdata.planets.PlanetProvider = <factory>)[source]

Bases: object

Transformation pipeline component that enriches people data and replaces the homeworld reference (a URL) with the planet's name.

fields = ('name', 'height', 'mass', 'hair_color', 'skin_color', 'eye_color', 'birth_year', 'gender', 'homeworld', 'edited', 'url')

planets: swdata.planets.PlanetProvider

transform_people(people: Iterator[Dict]) → Iterator[Tuple][source]
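
A rough sketch of what such a transformation step might look like is given below. It is an assumption-laden illustration rather than the actual method: FIELDS is shortened, StubPlanets is a hypothetical stand-in for a PlanetProvider, the planets argument is passed explicitly instead of being a dataclass field, and whether the real pipeline also emits a header row is not known from this page.

```python
from typing import Dict, Iterator, Tuple

# Shortened field list for illustration; the documented class lists more fields.
FIELDS = ("name", "height", "homeworld")


class StubPlanets:
    """Hypothetical stand-in for a PlanetProvider with hard-coded data."""

    _names = {"https://swapi.dev/api/planets/1/": "Tatooine"}

    def get_name(self, url: str) -> str:
        return self._names[url]


def transform_people(people: Iterator[Dict], planets: StubPlanets) -> Iterator[Tuple]:
    """Replace each homeworld URL with a name and keep only FIELDS, in order."""
    for person in people:
        row = dict(person)  # copy so the input record is not mutated
        row["homeworld"] = planets.get_name(row["homeworld"])
        yield tuple(row[name] for name in FIELDS)


people = iter([
    {
        "name": "Luke Skywalker",
        "height": "172",
        "homeworld": "https://swapi.dev/api/planets/1/",
        "films": [],  # not in FIELDS, so it is dropped
    }
])
rows = list(transform_people(people, StubPlanets()))
print(rows)
```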

swdata.planets module

These are planet data providers.

At the moment there are only 60 planets in the pool, so keeping them all in memory is not a problem. If the pool grows significantly, we can implement a more efficient data provider that builds an index only once, stores it, and updates it only with missing planets when they are requested.
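
The more efficient provider mentioned above does not exist in the package; one possible shape for it is sketched here. All names (IndexedPlanetProvider, fetch_name, fake_fetch) are hypothetical, and persisting the index between runs is left out.

```python
from typing import Callable, Dict, List, Optional


class IndexedPlanetProvider:
    """Hypothetical provider that keeps a URL-to-name index and
    fetches a planet only the first time it is requested."""

    def __init__(self, fetch_name: Callable[[str], str],
                 index: Optional[Dict[str, str]] = None) -> None:
        self._fetch_name = fetch_name
        # A previously stored index can be passed in; otherwise start empty.
        self._index: Dict[str, str] = dict(index or {})

    def get_name(self, url: str) -> str:
        if url not in self._index:
            # Only missing planets trigger a request.
            self._index[url] = self._fetch_name(url)
        return self._index[url]


calls: List[str] = []


def fake_fetch(url: str) -> str:
    """Offline stand-in for an HTTP lookup; records each call."""
    calls.append(url)
    return {"planets/1/": "Tatooine", "planets/2/": "Alderaan"}[url]


provider = IndexedPlanetProvider(fake_fetch)
first = provider.get_name("planets/1/")
second = provider.get_name("planets/1/")  # served from the index
print(first, second, calls)
```

The second lookup does not call fake_fetch again, which is the property the paragraph above asks for.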

class swdata.planets.EagerPlanetProvider[source]

Bases: swdata.planets.PlanetProvider

Eager provider loading all planets into memory upfront.

get_name(url)[source]

planets

class swdata.planets.LazyPlanetProvider[source]

Bases: swdata.planets.PlanetProvider

Lazy and slow planet provider, usable only when the number of queried planets is small compared to the total number of available planets.

CACHED_PLANETS = 100

CACHE_TIME = 3600

get_name(url)[source]

class swdata.planets.PlanetProvider[source]

Bases: abc.ABC

abstract get_name(url: str)[source]

swdata.reports module

class swdata.reports.Report(path: str, when_saved: datetime.datetime)[source]

Bases: swdata.reports.ReportViewer

Report metadata DTO

when_saved: datetime.datetime

class swdata.reports.ReportViewer(path: str)[source]

Bases: object

Data viewer for reports saved on disc

distinct(*columns)[source]

path: str

swdata.reports.save_csv(path: str, data: Iterator)[source]

Utility function to save data to a CSV file.

swdata.reports.save_report(directory: str, data: Iterator, timestamp_fmt='%Y-%m-%d-%H%M%S') → swdata.reports.Report[source]

Utility function to create a report file with a timestamp in its name.
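
To make the timestamped naming concrete, here is a minimal sketch using only the standard library. It is not the package's code: the function name save_report_sketch, the ".csv" extension, the use of the bare timestamp as the file name, and the tuple return value (instead of a Report) are all assumptions made for illustration.

```python
import csv
import os
import tempfile
from datetime import datetime
from typing import Iterable, Tuple


def save_report_sketch(directory: str, data: Iterable[Tuple],
                       timestamp_fmt: str = "%Y-%m-%d-%H%M%S") -> Tuple[str, datetime]:
    """Write rows to <directory>/<timestamp>.csv and return (path, when_saved)."""
    when_saved = datetime.now()
    os.makedirs(directory, exist_ok=True)
    # Assumed naming scheme: the formatted timestamp plus a .csv extension.
    path = os.path.join(directory, when_saved.strftime(timestamp_fmt) + ".csv")
    with open(path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(data)
    return path, when_saved


with tempfile.TemporaryDirectory() as tmp:
    path, when_saved = save_report_sketch(
        tmp, [("name", "homeworld"), ("Luke Skywalker", "Tatooine")]
    )
    with open(path, newline="", encoding="utf-8") as handle:
        saved_rows = list(csv.reader(handle))
    file_name = os.path.basename(path)

print(file_name, saved_rows)
```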

swdata.settings module

Module contents

This is the public API for the swdata package.

Everything besides the classes declared in this module should be considered private and subject to change without notice. This is an architectural decision that gives the package autonomy over its design and implementation.

Currently we are using a poor man's dependency injection delivered by dataclass fields. Preferably we should use a proper Inversion of Control container that would give us the flexibility of defining default types and setting up mappings like the one between the PlanetProvider abstract class and the concrete EagerPlanetProvider implementation. Mappings like this should come from the container setup, not from class declarations.

Currently, control over injected instances is delegated to the field declarations and set up through default_factory.
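
The default_factory approach can be illustrated with a small, self-contained sketch. The class names mirror those documented above, but the bodies are simplified stand-ins written for this example, and FakePlanetProvider is hypothetical.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


class PlanetProvider(ABC):
    @abstractmethod
    def get_name(self, url: str) -> str:
        ...


class EagerPlanetProvider(PlanetProvider):
    """Simplified stand-in; the real class loads planets from SWAPI."""

    def get_name(self, url: str) -> str:
        return "Tatooine"  # placeholder value for illustration


class FakePlanetProvider(PlanetProvider):
    """Hypothetical replacement, e.g. for tests."""

    def get_name(self, url: str) -> str:
        return "Test Planet"


@dataclass
class PeopleTransformator:
    # The abstract-to-concrete mapping lives in the field declaration:
    # default_factory builds an EagerPlanetProvider unless one is passed in.
    planets: PlanetProvider = field(default_factory=EagerPlanetProvider)


default = PeopleTransformator()
injected = PeopleTransformator(planets=FakePlanetProvider())
print(type(default.planets).__name__, injected.planets.get_name("any-url"))
```

The dependency can be swapped by the caller, but the default mapping is fixed in the class declaration, which is the limitation a container would remove.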

class swdata.SWPeople(_saver: swdata.people.PeopleSaver = <factory>)[source]

Bases: object

Gateway class for reporting functionality over SW People data.

create_collection(directory='reports') → swdata.reports.Report[source]

static get_distinct(path, *columns)[source]

static get_people_data(path, start=0, stop=10)[source]