pycdfpp package

class pycdfpp.Attribute

Bases: pybind11_object

property name

set_values(entries_values=None, entries_types=None)

Sets the values of the attribute.

This method can be called in two ways:

1. With values and optional types: set_values(entries_values, entries_types=None)
2. With another Attribute object: set_values(attribute)

Parameters:

entries_values: List[np.ndarray or List[float or int or datetime] or str]: The entry values to set for the attribute. When a list is passed, the values are converted to a numpy.ndarray with the appropriate data type; for integers, the smallest data type that can hold all the values is chosen.
entries_types: List[DataType] or None, optional: The data type for each entry of the attribute. If None, the data type is inferred from the values. (Default is None)
attribute: Attribute: An existing Attribute object to set the values from (for the second calling method).
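
The integer inference rule described for list inputs can be illustrated with a small standalone sketch. It does not use pycdfpp. The helper name smallest_signed_int_type is hypothetical, and restricting the search to signed CDF integer types is an assumption, since the description above does not say whether unsigned types are also considered.

```python
# Hypothetical helper, not part of pycdfpp: picks the narrowest signed
# CDF integer type whose range covers every value in the list.
# Assumption: only signed types are candidates.
SIGNED_TYPES = [
    ("CDF_INT1", 8),
    ("CDF_INT2", 16),
    ("CDF_INT4", 32),
    ("CDF_INT8", 64),
]


def smallest_signed_int_type(values):
    lo, hi = min(values), max(values)
    for name, bits in SIGNED_TYPES:
        # A signed type with `bits` bits covers [-2**(bits-1), 2**(bits-1) - 1].
        if -(2 ** (bits - 1)) <= lo and hi <= 2 ** (bits - 1) - 1:
            return name
    raise OverflowError("values do not fit in a 64-bit signed integer")


print(smallest_signed_int_type([0, 1, 2]))     # CDF_INT1
print(smallest_signed_int_type([0, 300]))      # CDF_INT2
print(smallest_signed_int_type([-70000, 5]))   # CDF_INT4
```

Passing entries_types explicitly bypasses any such inference, which is the safer choice when the on-disk type matters.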

type(self: pycdfpp._pycdfpp.Attribute, arg0: SupportsInt) → pycdfpp._pycdfpp.DataType

class pycdfpp.CDF

Bases: pybind11_object

A CDF file object.

Attributes:

attributes: dict: file attributes
variables: dict: file variables
majority: cdf_majority: file majority
distribution_version: int: file distribution version
lazy_loaded: bool: file lazy loading state
compression: CompressionType: file compression type

Methods

`add_attribute`([name, entries_values, ...])	Adds a new attribute to the CDF.
`add_variable`([name, values, data_type, ...])	Adds a new variable to the CDF.

add_attribute(name=None, entries_values=None, entries_types=None) → Attribute

Adds a new attribute to the CDF.

This method can be called in two ways:

1. With attribute parameters: add_attribute(name, entries_values, entries_types=None)
2. With an Attribute object: add_attribute(attribute)

Parameters:

name: str: The name of the attribute to add.
entries_values: List[np.ndarray or List[float or int or datetime] or str]: The entry values to set for the attribute. When a list is passed, the values are converted to a numpy.ndarray with the appropriate data type; for integers, the smallest data type that can hold all the values is chosen.
entries_types: List[DataType] or None, optional: The data type for each entry of the attribute. If None, the data type is inferred from the values. (Default is None)
attribute: Attribute: An existing Attribute object to add to the CDF (for the second calling method).

Returns:

Attribute or None: Returns the newly created attribute if successful. Otherwise, returns None.

Raises:

ValueError: If the attribute already exists.

Examples

>>> from pycdfpp import CDF, DataType
>>> import numpy as np
>>> from datetime import datetime
>>> cdf = CDF()
>>> # First method: creating a new attribute with parameters
>>> cdf.add_attribute("attr1", [np.arange(10, dtype=np.int32)], [DataType.CDF_INT4])
attr1: [ [ [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ] ] ]
>>> # Second method: adding an existing attribute
>>> cdf2 = CDF()
>>> cdf2.add_attribute(cdf.attributes["attr1"])
attr1: [ [ [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ] ] ]
>>> # Another example with multiple entries of different types
>>> cdf.add_attribute("multi", [np.arange(2, dtype=np.int32), [1.,2.,3.], "hello", [datetime(2010,1,1), datetime(2020,1,1)]])
multi: [ [ [ 0, 1 ], [ 1, 2, 3 ], "hello", [ 2010-01-01T00:00:00.000000000, 2020-01-01T00:00:00.000000000 ] ] ]

add_variable(name=None, values=None, data_type=None, is_nrv=False, compression=CompressionType.no_compression, attributes=None) → Variable

Adds a new variable to the CDF.

This method can be called in two ways:

1. With variable parameters: add_variable(name, values=None, data_type=None, is_nrv=False, compression=CompressionType.no_compression, attributes=None)
2. With a Variable object: add_variable(variable)

Parameters:

name: str: The name of the variable to add.
values: numpy.ndarray or list or None, optional: The values to set for the variable. If None, the variable is created with no values. When a list is passed, the values are converted to a numpy.ndarray with the appropriate data type; for integers, the smallest data type that can hold all the values is chosen. (Default is None)
data_type: DataType or None, optional: The data type of the variable. If None, the data type is inferred from the values. (Default is None)
is_nrv: bool, optional: Whether the variable is a non-record variable. (Default is False)
compression: CompressionType, optional: The compression type to use for the variable. (Default is CompressionType.no_compression)
attributes: Mapping[str, List[Any]] or None, optional: The attributes to set for the variable. If None, the variable is created with no attributes. (Default is None)
variable: Variable: An existing Variable object to add to the CDF (for the second calling method).

Returns:

Variable or None: Returns the newly created variable if successful. Otherwise, returns None.

Raises:

ValueError: If the variable already exists.

Examples

>>> from pycdfpp import CDF, DataType, CompressionType
>>> import numpy as np
>>> cdf = CDF()
>>> # First method: creating a new variable with parameters
>>> cdf.add_variable("var1", np.arange(10, dtype=np.int32), DataType.CDF_INT4, compression=CompressionType.gzip_compression)
var1:
  shape: [ 10 ]
  type: CDF_INT4
  record vary: True
  compression: GNU GZIP
  ...
>>> # Second method: adding an existing variable
>>> cdf2 = CDF()
>>> cdf2.add_variable(cdf["var1"])  # Assuming var1 is already defined in cdf (from the first method)
var1:
  shape: [ 10 ]
  type: CDF_INT4
  record vary: True
  compression: GNU GZIP
  ...

property attributes

property compression

property distribution_version

Filters the CDF object based on the provided criteria.

Parameters:

cdf: CDF: The CDF object to filter.
variables: Union[List[str], str, re.Pattern, Callable[[Variable], bool]], optional: A list of variable names to keep, a regex pattern, or a callable that returns True for variables to keep. If None (default), no variables are kept.
attributes: Union[List[str], str, re.Pattern, Callable[[Attribute], bool]], optional: A list of attribute names to keep, a regex pattern, or a callable that returns True for attributes to keep. If None (default), no attributes are kept.
inplace: bool, optional: If True, modifies the original CDF object. If False, returns a new filtered CDF object. (Default is False)

Returns:

CDF: Returns a new CDF object with the filtered variables and attributes.
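
The four accepted criterion forms (list of names, string, compiled pattern, callable) can be pictured as one predicate per criterion. The sketch below is standalone and does not call pycdfpp. make_predicate is a hypothetical name, and two choices in it are assumptions rather than documented behaviour: a plain string is compiled as a regular expression, and patterns must match the whole name.

```python
import re


def make_predicate(criterion):
    """Return a function (name, obj) -> bool for one filter criterion."""
    if criterion is None:
        # Per the parameter description, None keeps nothing.
        return lambda name, obj: False
    if callable(criterion):
        # Callables receive the object itself (a Variable or Attribute).
        return lambda name, obj: bool(criterion(obj))
    if isinstance(criterion, str):
        # Assumption: a bare string is treated as a regex.
        pattern = re.compile(criterion)
    elif isinstance(criterion, re.Pattern):
        pattern = criterion
    else:
        names = set(criterion)  # list of exact names
        return lambda name, obj: name in names
    # Assumption: the pattern must match the entire name.
    return lambda name, obj: pattern.fullmatch(name) is not None


# Stand-in objects; with pycdfpp these would be Variable instances.
variables = {"Epoch": object(), "B_gse": object(), "B_gsm": object()}
keep = make_predicate(r"B_.*")
kept = [name for name, var in variables.items() if keep(name, var)]
print(kept)  # ['B_gse', 'B_gsm']
```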

items(self: pycdfpp._pycdfpp.CDF) → collections.abc.Iterator[tuple[str, pycdfpp._pycdfpp.Variable]]

keys(self: pycdfpp._pycdfpp.CDF) → list[str]

property lazy_loaded

property majority

class pycdfpp.CompressionType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

ahuff_compression = 3

gzip_compression = 5

huff_compression = 2

no_compression = 0

rle_compression = 1

class pycdfpp.DataType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

CDF_BYTE = 41

CDF_CHAR = 51

CDF_DOUBLE = 45

CDF_EPOCH = 31

CDF_EPOCH16 = 32

CDF_FLOAT = 44

CDF_INT1 = 1

CDF_INT2 = 2

CDF_INT4 = 4

CDF_INT8 = 8

CDF_NONE = 0

CDF_REAL4 = 21

CDF_REAL8 = 22

CDF_TIME_TT2000 = 33

CDF_UCHAR = 52

CDF_UINT1 = 11

CDF_UINT2 = 12

CDF_UINT4 = 14

class pycdfpp.Majority(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

column = 0

row = 1

class pycdfpp.Variable

Bases: pybind11_object

A CDF Variable (either R or Z variable)

Attributes:

attributes: dict: variable attributes
name: str: variable name
type: DataType: variable data type (e.g. CDF_DOUBLE, CDF_TIME_TT2000, …)
shape: List[int]: variable shape (records + record shape)
majority: cdf_majority: variable majority as written in the CDF file. Note that pycdfpp will always expose row-major data.
values_loaded: bool: True if values are available in memory. This is useful with lazy loading to know if values are already loaded.
compression: CompressionType: variable compression type (supported values are no_compression, rle_compression, gzip_compression)
values: numpy.array: returns variable values as a numpy.array of the corresponding dtype and shape. No copies are involved; the returned array is just a view on the variable data.
values_encoded: numpy.array: same as values, except that string variables are encoded, which involves a data copy. Since numpy uses UTF-32, expect a 4x memory increase for string values.
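
The 4x figure quoted for values_encoded follows from UTF-32 using four bytes per character, against one byte per character for ASCII text. A minimal standalone illustration using only the standard library (the sample string is arbitrary):

```python
text = "solar wind"

stored = text.encode("utf-8")        # one byte per character for ASCII text
decoded = text.encode("utf-32-le")   # four bytes per character, no BOM

print(len(stored), len(decoded))     # 10 40
```

For large string variables, reading values and decoding only the records that are needed avoids paying this cost for the whole array.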

Methods

`add_attribute`([name, values, data_type])	Adds a new attribute to the variable.
`set_values`(values[, data_type, force])	Sets or resets the values of the variable.

set_compression_type

Sets the variable compression type

add_attribute(name=None, values=None, data_type=None) → VariableAttribute

Adds a new attribute to the variable.

This method can be called in two ways:

1. With attribute parameters: add_attribute(name, values, data_type=None)
2. With a VariableAttribute object: add_attribute(attribute)

Parameters:

name: str: The name of the attribute to add.
values: np.ndarray or List[float or int or datetime] or str: The values to set for the attribute. When a list is passed, the values are converted to a numpy.ndarray with the appropriate data type; for integers, the smallest data type that can hold all the values is chosen.
data_type: DataType or None, optional: The data type of the attribute. If None, the data type is inferred from the values. (Default is None)
attribute: VariableAttribute: An existing VariableAttribute object to add to the variable (for the second calling method).

Returns:

VariableAttribute: Returns the newly created attribute if successful.

Raises:

ValueError: If the attribute already exists.

Examples

>>> from pycdfpp import CDF, DataType
>>> import numpy as np
>>> cdf = CDF()
>>> cdf.add_variable("var1", np.arange(10, dtype=np.int32), DataType.CDF_INT4)
var1:
  shape: [ 10 ]
  type: CDF_INT4
  record vary: True
  compression: None
  ...
>>> # First method: creating a new attribute with parameters
>>> cdf["var1"].add_attribute("attr1", np.arange(10, dtype=np.int32), DataType.CDF_INT4)
attr1: [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ]
>>> # Second method: adding an existing attribute
>>> var2 = cdf.add_variable("var2", np.arange(5))
>>> var2.add_attribute(cdf["var1"].attributes["attr1"])
attr1: [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ]

property attributes

property compression

is_contiguous(self: pycdfpp._pycdfpp.Variable) → bool: Whether the variable’s records are stored as a single contiguous block in the file (True) or fragmented across several VVR/CVVR blocks (False). Walks the variable’s index records on first call, then caches the result.

property is_nrv

property is_zvariable

property majority

property name

set_values(values, data_type=None, force=False)

Sets or resets the values of the variable.

Parameters:

values: numpy.ndarray or list or tuple or Variable: The values to set for the variable.
data_type: DataType or None, optional: The data type of the variable. If None, the data type is inferred from the values. When passing integer values as a list or tuple, the smallest data type that can hold all the values is chosen. When passing a Variable, the data type is taken from the Variable. (Default is None)
force: bool, optional: If True, allows overwriting existing values even if the shape or data type does not match. (Default is False)

Returns:

None

Raises:

ValueError: If the shape or data type does not match and force is False.

Examples

>>> from pycdfpp import CDF, DataType
>>> import numpy as np
>>> cdf = CDF()
>>> cdf.add_variable("var1")
var1:
  shape: [ ]
  type: CDF_NONE
  record vary: True
  compression: None
  ...
>>> # Setting values with numpy array
>>> cdf["var1"].set_values(np.arange(10, 20, dtype=np.int32))
>>> cdf["var1"].values
array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19], dtype=int32)

property shape

property type

property values

property values_encoded

property values_loaded

class pycdfpp.epoch

Bases: pybind11_object

property mseconds

class pycdfpp.epoch16

Bases: pybind11_object

property picoseconds

property seconds

pycdfpp.load(file_or_buffer: str, iso_8859_1_to_utf8: bool = True, lazy_load: bool = True)

Load and parse a CDF file.

Parameters:

file_or_buffer: str or ByteString: Either a filename to be loaded or an in-memory file implementing the Python buffer protocol.
iso_8859_1_to_utf8: bool, optional: Automatically convert Latin-1 characters to their UTF-8 equivalents when True. CDF versions prior to 3.8 did not support UTF-8, so some CDF files might contain “illegal” Latin-1 characters. This option has no impact on valid UTF-8 characters. (Default is True)
lazy_load: bool, optional: Controls whether variable values are loaded immediately or only when accessed by the user. If True, variable values are loaded on demand. If False, all variable values are loaded during parsing. (Default is True)

Returns:

CDF or None: Returns a CDF object upon successful read. If there’s an issue with the read, None is returned.
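
Because a failed read is reported as None, callers may prefer to turn that into an exception at the call site. The wrapper below is a sketch of that pattern. load_or_raise is a hypothetical helper, and the loader is passed in as an argument so that the snippet runs without pycdfpp installed.

```python
def load_or_raise(source, loader, **options):
    """Call loader(source, **options) and reject a None result."""
    cdf = loader(source, **options)
    if cdf is None:
        raise ValueError("could not parse CDF input")
    return cdf


# Intended use, assuming pycdfpp is installed and "data.cdf" exists:
#     import pycdfpp
#     cdf = load_or_raise("data.cdf", pycdfpp.load, lazy_load=False)
```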

pycdfpp.save(*args, **kwargs)

Overloaded function.

save(cdf: pycdfpp._pycdfpp.CDF, fname: str) -> bool
save(cdf: pycdfpp._pycdfpp.CDF) -> pycdfpp._pycdfpp._cdf_bytes

pycdfpp.to_datetime(values)

Convert any compatible given collection of time values to a list of datetime.datetime.

Parameters:

values: Variable or epoch or List[epoch] or epoch16 or List[epoch16] or tt2000_t or List[tt2000_t] or numpy.array[numpy.datetime64[ns]]: input value(s) to convert to datetime.datetime

Returns:

List[datetime.datetime]

Raises:

TypeError or IndexError: If the input values are not compatible time types.

pycdfpp.to_datetime64(values)

Convert any compatible given collection of time values to a numpy.datetime64 array.

Parameters:

values: Variable or epoch or List[epoch] or numpy.ndarray[epoch] or epoch16 or List[epoch16] or numpy.array[epoch16] or tt2000_t or List[tt2000_t] or numpy.array[tt2000_t]: input value(s) to convert to numpy.datetime64

Returns:

numpy.ndarray[numpy.datetime64]

Raises:

TypeError or IndexError: If the input values are not compatible time types.

pycdfpp.to_time_string(values, format: str)

Format CDF time values as an array of fixed-width ASCII strings.

Parameters:

values: Variable or numpy.ndarray[tt2000_t] or numpy.ndarray[epoch] or numpy.ndarray[epoch16]: CDF time values to format.
format: str: strftime-compatible format string (e.g. '%Y-%m-%dT%H:%M:%SZ'). %S automatically includes sub-second digits matching the input precision (3 for epoch, 9 for tt2000, 12 for epoch16).

Returns:

numpy.ndarray: Array of byte strings (dtype S{N}) with the same shape as input.
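
The effect of the precision-dependent %S can be mimicked in plain Python. The sketch below is not pycdfpp's implementation: format_time and its arguments are hypothetical, it returns a str rather than fixed-width byte strings, it truncates rather than rounds, and the decimal point between seconds and fraction is an assumption about the output layout.

```python
from datetime import datetime, timezone

# Sub-second digits per CDF time type, as listed in the parameter description.
SUBSECOND_DIGITS = {"epoch": 3, "tt2000": 9, "epoch16": 12}


def format_time(unix_seconds, picoseconds, kind, fmt):
    """Format whole seconds plus a picosecond fraction for one time type."""
    digits = SUBSECOND_DIGITS[kind]
    moment = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)
    # Zero-pad the fraction to 12 digits, then keep the leading `digits`.
    fraction = f"{picoseconds:012d}"[:digits]
    seconds_field = f"{moment.second:02d}.{fraction}"
    # Naive substitution: a literal "%%S" in fmt is not handled.
    return moment.strftime(fmt.replace("%S", seconds_field))


fmt = "%Y-%m-%dT%H:%M:%SZ"
print(format_time(0, 123456789012, "epoch", fmt))    # 1970-01-01T00:00:00.123Z
print(format_time(0, 123456789012, "tt2000", fmt))   # 1970-01-01T00:00:00.123456789Z
print(format_time(0, 123456789012, "epoch16", fmt))  # 1970-01-01T00:00:00.123456789012Z
```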

class pycdfpp.tt2000_t

Bases: pybind11_object

property nseconds