DataSet Containers

This module contains DataSet implementations used to hold Importables.

The concrete DataSet classes are meant to be used as the destination when running import algorithms. Before the import starts, the dataset should contain the initial data found in the destination system; after the import has run, it contains the newly synchronized data. In this sense the structure serves as both the input and the output argument of the algorithm.

class importtools.datasets.DataSet

An abstract base class (ABC) that represents a mutable set of elements.

This class serves as documentation of the methods a DataSet should implement. For concrete implementations available in this module see SimpleDataSet and RecordingDataSet.

A DataSet is very similar to a normal set, the difference being that you can get() an element. This is useful because, when the elements are Importable instances, two elements can be equal (their natural keys are the same) while their contents differ.
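To illustrate why get() matters, here is a minimal, self-contained sketch that does not use importtools itself. The Record class is a hypothetical stand-in for an Importable (an assumption made for illustration): it compares and hashes by its natural key only, while payload carries the contents. The dict that maps each element to itself is likewise only one plausible way to obtain this behaviour.

```python
class Record:
    """Hypothetical stand-in for an Importable, compared by natural key only."""

    def __init__(self, key, payload):
        self.key = key
        self.payload = payload

    def __eq__(self, other):
        return isinstance(other, Record) and self.key == other.key

    def __hash__(self):
        return hash(self.key)


stored = Record(1, "old content")
incoming = Record(1, "new content")

# A plain set can only report whether an equal element is present.
plain = {stored}
found_in_set = incoming in plain

# A dict mapping each element to itself can hand back the stored element,
# which is the kind of lookup get() adds on top of a normal set.
by_element = {stored: stored}
match = by_element.get(incoming)
```

Here match is the stored object, so its payload can be compared with the payload of the incoming element even though the two are equal.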

__iter__()

Iterate over all the content of this set.

get(element, default=None)

Return an equal element from the dataset or the default value.

add(element)

Add or replace the element in the dataset.

pop(element, default=None)

Remove and return an equal element from the dataset.

sync(iterable)

Add, remove and update the elements of this dataset so that they match those in the iterable.
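The intent of sync() can be sketched on a plain dict, without using importtools. Both the Record class and the sync_sketch function are hypothetical names introduced for illustration, and the sketch assumes that "update" means replacing the stored element with the equal incoming one; the actual implementation may update elements in place instead.

```python
class Record:
    """Hypothetical stand-in for an Importable, compared by natural key only."""

    def __init__(self, key, payload):
        self.key = key
        self.payload = payload

    def __eq__(self, other):
        return isinstance(other, Record) and self.key == other.key

    def __hash__(self):
        return hash(self.key)


def sync_sketch(dataset, iterable):
    """Make dataset (a dict mapping element -> element) match iterable."""
    incoming = {element: element for element in iterable}
    # Remove elements that are missing from the source.
    for element in list(dataset):
        if element not in incoming:
            del dataset[element]
    # Add new elements; replace equal ones so the newest contents win.
    for element in incoming:
        dataset.pop(element, None)
        dataset[element] = element
    return dataset


destination = {r: r for r in (Record(1, "a"), Record(2, "b"))}
source = [Record(2, "B"), Record(3, "c")]
sync_sketch(destination, source)

keys_after = sorted(r.key for r in destination)
payload_of_2 = destination[Record(2, None)].payload
```

After the call, the element with key 1 is gone, the element with key 3 has been added, and the element with key 2 carries the contents from the source.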

class importtools.datasets.SimpleDataSet(data_loader=None, *args, **kwargs)

Bases: dict, importtools.datasets.DataSet

A simple dict-based DataSet implementation.

At first, a newly created instance has no elements:

>>> from importtools import Importable
>>> i1, i2, i3 = Importable(0), Importable(0), Importable(1)
>>> sds = SimpleDataSet()
>>> list(sds)
[]

After creation, it can be populated, and the elements in the dataset can be retrieved using other equal elements. Trying to get a nonexistent element returns the default value, or None if no default is given:

>>> sds.add(i1)
>>> sds.get(i1) is i1
True
>>> sds.get(i2) is i1
True
>>> sds.get(i3) is None
True
>>> sds.get(i3, 'default')
'default'
>>> sds.pop(i3, 'default')
'default'
>>> sds.pop(i1)
Importable(0)

An iterable containing the initial data can be passed when constructing instances:

>>> SimpleDataSet((i1, i3))
SimpleDataSet([Importable(0), Importable(1)])

A ValueError is raised if the initial data contains duplicates:

>>> init_values = (i1, i2, i3)
>>> SimpleDataSet(init_values) 
Traceback (most recent call last):
ValueError:

class importtools.datasets.RecordingDataSet(data_loader=(), *args, **kwargs)

Bases: importtools.datasets.SimpleDataSet

A DataSet implementation that remembers all the changes, additions and removals done to it.

Using instances of this class as the destination of the import algorithm allows the changes to be persisted efficiently, because they are grouped in a way suited for batch processing.

reset()

Forget all recorded changes.

Calling this method will empty out added, removed and changed.

added

An iterable of all added elements in the dataset.

removed

An iterable of all removed elements in the dataset.

changed

An iterable of all elements that have been changed.

Only the elements that were part of the set from the beginning or before the last call to reset will be tracked. Deleting an element that has changed will not remove it from this list. This means it’s possible for an element to be present in both changed and removed iterables.
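Because an element can appear in both changed and removed, code that persists the recorded changes may want to account for the overlap. The following is a minimal sketch of one way to group the three iterables into batches. The plan_batches function and the batch names are hypothetical and not part of importtools; in real use the arguments would be the added, removed and changed iterables of a RecordingDataSet. Skipping updates for removed elements is an assumption that suits a backend where updating a row before deleting it is wasted work.

```python
def plan_batches(added, removed, changed):
    """Group recorded changes into three batches for a hypothetical backend."""
    removed = list(removed)
    # An element may be present in both changed and removed; this sketch
    # skips the update for anything that is going to be deleted anyway.
    updates = [element for element in changed if element not in removed]
    return {"insert": list(added), "update": updates, "delete": removed}


# Plain strings stand in for Importable instances here.
plan = plan_batches(added=["a"], removed=["c"], changed=["b", "c"])
```

In this example "c" was changed and later removed, so it ends up only in the delete batch.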
