pycdfpp package

class pycdfpp.Attribute

Bases: pybind11_object

property name
set_values(entries_values=None, entries_types=None)

Sets the values of the attribute.

This method can be called in two ways:

1. With values and optional types: set_values(entries_values, entries_types=None)
2. With another Attribute object: set_values(attribute)

Parameters:
entries_values: List[np.ndarray or List[float or int or datetime] or str]

The value entries to set for the attribute. When a list is passed, its values are converted to a numpy.ndarray with the appropriate data type; for integers, the smallest data type that can hold all the values is chosen.

entries_types: List[DataType] or None, optional

The data type for each entry of the attribute. If None, the data type is inferred from the values. (Default is None)

attribute: Attribute

An existing Attribute object to set the values from (for the second calling method).
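
The integer inference described for list inputs can be pictured with a small pure-Python sketch. The helper `smallest_int_type` and the order in which candidate types are tried are assumptions made for illustration; pycdfpp's actual selection rules (for example, whether it prefers signed or unsigned types) may differ.

```python
# Hypothetical helper, not part of pycdfpp: pick the narrowest CDF integer
# type whose range covers every value in the list.
# Assumption: signed types are tried before unsigned types of the same width.
_CANDIDATES = [
    ("CDF_INT1", -2**7, 2**7 - 1),
    ("CDF_UINT1", 0, 2**8 - 1),
    ("CDF_INT2", -2**15, 2**15 - 1),
    ("CDF_UINT2", 0, 2**16 - 1),
    ("CDF_INT4", -2**31, 2**31 - 1),
    ("CDF_UINT4", 0, 2**32 - 1),
    ("CDF_INT8", -2**63, 2**63 - 1),
]


def smallest_int_type(values):
    lo, hi = min(values), max(values)
    for name, type_min, type_max in _CANDIDATES:
        if type_min <= lo and hi <= type_max:
            return name
    raise ValueError("values do not fit in any CDF integer type")


print(smallest_int_type([1, 2, 3]))    # CDF_INT1
print(smallest_int_type([0, 200]))     # CDF_UINT1
print(smallest_int_type([-1, 70000]))  # CDF_INT4
```

Passing an explicit data type skips this kind of inference, which is the safer choice when the on-disk type matters.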

type(self: pycdfpp._pycdfpp.Attribute, arg0: SupportsInt) -> pycdfpp._pycdfpp.DataType
class pycdfpp.CDF

Bases: pybind11_object

A CDF file object.

Attributes:
attributes: dict

file attributes

variables: dict

file variables

majority: cdf_majority

file majority

distribution_version: int

file distribution version

lazy_loaded: bool

file lazy loading state

compression: CompressionType

file compression type

Methods

add_attribute([name, entries_values, ...])

Adds a new attribute to the CDF.

add_variable([name, values, data_type, ...])

Adds a new variable to the CDF.

add_attribute(name=None, entries_values=None, entries_types=None) -> Attribute

Adds a new attribute to the CDF.

This method can be called in two ways:

1. With attribute parameters: add_attribute(name, entries_values, entries_types=None)
2. With an Attribute object: add_attribute(attribute)

Parameters:
name: str

The name of the attribute to add.

entries_values: List[np.ndarray or List[float or int or datetime] or str]

The value entries to set for the attribute. When a list is passed, its values are converted to a numpy.ndarray with the appropriate data type; for integers, the smallest data type that can hold all the values is chosen.

entries_types: List[DataType] or None, optional

The data type for each entry of the attribute. If None, the data type is inferred from the values. (Default is None)

attribute: Attribute

An existing Attribute object to add to the CDF (for the second calling method).

Returns:
Attribute or None

Returns the newly created attribute if successful. Otherwise, returns None.

Raises:
ValueError

If the attribute already exists.

Examples

>>> from pycdfpp import CDF, DataType
>>> import numpy as np
>>> from datetime import datetime
>>> cdf = CDF()
>>> # First method: creating a new attribute with parameters
>>> cdf.add_attribute("attr1", [np.arange(10, dtype=np.int32)], [DataType.CDF_INT4])
attr1: [ [ [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ] ] ]
>>> # Second method: adding an existing attribute
>>> cdf2 = CDF()
>>> cdf2.add_attribute(cdf.attributes["attr1"])
attr1: [ [ [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ] ] ]
>>> # Another example with multiple entries of different types
>>> cdf.add_attribute("multi", [np.arange(2, dtype=np.int32), [1.,2.,3.], "hello", [datetime(2010,1,1), datetime(2020,1,1)]])
multi: [ [ [ 0, 1 ], [ 1, 2, 3 ], "hello", [ 2010-01-01T00:00:00.000000000, 2020-01-01T00:00:00.000000000 ] ] ]
add_variable(name=None, values=None, data_type=None, is_nrv=False, compression=CompressionType.no_compression, attributes=None) -> Variable

Adds a new variable to the CDF.

This method can be called in two ways:

1. With variable parameters: add_variable(name, values=None, data_type=None, is_nrv=False, compression=CompressionType.no_compression, attributes=None)
2. With a Variable object: add_variable(variable)

Parameters:
name: str

The name of the variable to add.

values: numpy.ndarray or list or None, optional

The values to set for the variable. If None, the variable is created with no values. (Default is None) When a list is passed, its values are converted to a numpy.ndarray with the appropriate data type; for integers, the smallest data type that can hold all the values is chosen.

data_type: DataType or None, optional

The data type of the variable. If None, the data type is inferred from the values. (Default is None)

is_nrv: bool, optional

Whether the variable is a non-record variable. (Default is False)

compression: CompressionType, optional

The compression type to use for the variable. (Default is CompressionType.no_compression)

attributes: Mapping[str, List[Any]] or None, optional

The attributes to set for the variable. If None, the variable is created with no attributes. (Default is None)

variable: Variable

An existing Variable object to add to the CDF (for the second calling method).

Returns:
Variable or None

Returns the newly created variable if successful. Otherwise, returns None.

Raises:
ValueError

If the variable already exists.

Examples

>>> from pycdfpp import CDF, DataType, CompressionType
>>> import numpy as np
>>> cdf = CDF()
>>> # First method: creating a new variable with parameters
>>> cdf.add_variable("var1", np.arange(10, dtype=np.int32), DataType.CDF_INT4, compression=CompressionType.gzip_compression)
var1:
  shape: [ 10 ]
  type: CDF_INT4
  record vary: True
  compression: GNU GZIP
  ...
>>> # Second method: adding an existing variable
>>> cdf2 = CDF()
>>> cdf2.add_variable(cdf["var1"])  # Assuming var1 is already defined in cdf (from the first method)
var1:
  shape: [ 10 ]
  type: CDF_INT4
  record vary: True
  compression: GNU GZIP
  ...
property attributes
property compression
property distribution_version
filter(variables: List[str] | str | Pattern | Callable[[Variable], bool] = None, attributes: List[str] | str | Pattern | Callable[[Attribute], bool] = None, inplace=False) -> CDF

Filters the CDF object based on the provided criteria.

Parameters:
cdf: CDF

The CDF object to filter.

variables: Union[List[str], str, re.Pattern, Callable[[Variable], bool]], optional

A list of variable names to keep, a regex pattern, or a callable that returns True for variables to keep. If None (default), no variables are kept.

attributes: Union[List[str], str, re.Pattern, Callable[[Attribute], bool]], optional

A list of attribute names to keep, a regex pattern, or a callable that returns True for attributes to keep. If None (default), no attributes are kept.

inplace: bool, optional

If True, modifies the original CDF object. If False, returns a new filtered CDF object. (Default is False)

Returns:
CDF

Returns a new CDF object with the filtered variables and attributes.
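
One way to picture how the accepted criterion kinds could be reduced to a single keep-or-drop decision is sketched below. `make_predicate` is a hypothetical helper, not pycdfpp's implementation: it works on plain names instead of Variable or Attribute objects, and treating a bare string as a regular expression matched against the whole name is an assumption.

```python
import re


def make_predicate(criterion):
    """Hypothetical: turn a filter criterion into a function name -> bool."""
    if criterion is None:
        # Documented default: nothing is kept.
        return lambda name: False
    if callable(criterion):
        return criterion
    if isinstance(criterion, re.Pattern):
        return lambda name: criterion.fullmatch(name) is not None
    if isinstance(criterion, str):
        # Assumption: a bare string is compiled as a regular expression.
        pattern = re.compile(criterion)
        return lambda name: pattern.fullmatch(name) is not None
    # Anything else is treated as a collection of exact names.
    wanted = set(criterion)
    return lambda name: name in wanted


names = ["epoch", "B_gse", "B_gsm", "density"]  # made-up variable names
print([n for n in names if make_predicate(["epoch", "density"])(n)])
print([n for n in names if make_predicate(re.compile(r"B_.*"))(n)])
print([n for n in names if make_predicate(lambda n: n.startswith("d"))(n)])
```

In the real method the callable receives a Variable or Attribute object rather than a name, so a predicate can also inspect properties such as the type or shape.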

items(self: pycdfpp._pycdfpp.CDF) -> collections.abc.Iterator[tuple[str, pycdfpp._pycdfpp.Variable]]
keys(self: pycdfpp._pycdfpp.CDF) -> list[str]
property lazy_loaded
property majority
class pycdfpp.CompressionType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

ahuff_compression = 3
gzip_compression = 5
huff_compression = 2
no_compression = 0
rle_compression = 1
class pycdfpp.DataType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

CDF_BYTE = 41
CDF_CHAR = 51
CDF_DOUBLE = 45
CDF_EPOCH = 31
CDF_EPOCH16 = 32
CDF_FLOAT = 44
CDF_INT1 = 1
CDF_INT2 = 2
CDF_INT4 = 4
CDF_INT8 = 8
CDF_NONE = 0
CDF_REAL4 = 21
CDF_REAL8 = 22
CDF_TIME_TT2000 = 33
CDF_UCHAR = 52
CDF_UINT1 = 11
CDF_UINT2 = 12
CDF_UINT4 = 14
class pycdfpp.Majority(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

column = 0
row = 1
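
The two Majority values name the two ways a multi-dimensional record can be laid out in a flat buffer. The stdlib-only sketch below reorders a column-major buffer into row-major order for a 2D case; `column_major_to_row_major` is a hypothetical helper for illustration and says nothing about how pycdfpp performs the conversion internally.

```python
def column_major_to_row_major(flat, rows, cols):
    """Reorder a column-major flat buffer into row-major order."""
    # In column-major order, element (r, c) sits at offset c * rows + r.
    return [flat[c * rows + r] for r in range(rows) for c in range(cols)]


# The 2x3 matrix [[1, 2, 3], [4, 5, 6]] stored column by column:
column_major = [1, 4, 2, 5, 3, 6]
row_major = column_major_to_row_major(column_major, rows=2, cols=3)
print(row_major)  # [1, 2, 3, 4, 5, 6]
```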
class pycdfpp.Variable

Bases: pybind11_object

A CDF Variable (either R or Z variable)

Attributes:
attributes: dict

variable attributes

name: str

variable name

type: DataType

variable data type (e.g. CDF_DOUBLE, CDF_TIME_TT2000, …)

shape: List[int]

variable shape (records + record shape)

majority: cdf_majority

variable majority as written in the CDF file. Note that pycdfpp always exposes row-major data.

values_loaded: bool

True if values are available in memory. This is useful with lazy loading to know whether values are already loaded.

compression: CompressionType

variable compression type (supported values are no_compression, rle_compression, gzip_compression)

values: numpy.array

returns variable values as a numpy.array of the corresponding dtype and shape. Note that no copies are involved; the returned array is just a view on the variable data.

values_encoded: numpy.array

same as values, except that string variables are encoded, which involves a data copy. Since numpy uses UTF-32, expect a 4x memory increase for string values.
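
The 4x figure for values_encoded follows from fixed-width UTF-32 using four bytes per character. The stdlib-only illustration below mimics that with `str.encode`; the sample string is made up, and the snippet calls neither pycdfpp nor numpy.

```python
name = "MMS1_FGM_B_GSE"              # made-up sample string value
as_stored = name.encode("ascii")     # one byte per character
as_utf32 = name.encode("utf-32-le")  # four bytes per character, no BOM

print(len(as_stored), len(as_utf32))
print(len(as_utf32) // len(as_stored))  # ratio between the two encodings
```

For large string variables, reading values and decoding only the entries that are needed avoids paying this cost for the whole array.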

Methods

add_attribute([name, values, data_type])

Adds a new attribute to the variable.

set_values(values[, data_type, force])

Sets or resets the values of the variable.

set_compression_type

Sets the variable compression type

add_attribute(name=None, values=None, data_type=None) -> VariableAttribute

Adds a new attribute to the variable.

This method can be called in two ways:

1. With attribute parameters: add_attribute(name, values, data_type=None)
2. With a VariableAttribute object: add_attribute(attribute)

Parameters:
name: str

The name of the attribute to add.

values: np.ndarray or List[float or int or datetime] or str

The values to set for the attribute. When a list is passed, its values are converted to a numpy.ndarray with the appropriate data type; for integers, the smallest data type that can hold all the values is chosen.

data_type: DataType or None, optional

The data type of the attribute. If None, the data type is inferred from the values. (Default is None)

attribute: VariableAttribute

An existing VariableAttribute object to add to the variable (for the second calling method).

Returns:
VariableAttribute

Returns the newly created attribute if successful.

Raises:
ValueError

If the attribute already exists.

Examples

>>> from pycdfpp import CDF, DataType
>>> import numpy as np
>>> cdf = CDF()
>>> cdf.add_variable("var1", np.arange(10, dtype=np.int32), DataType.CDF_INT4)
var1:
  shape: [ 10 ]
  type: CDF_INT4
  record vary: True
  compression: None
  ...
>>> # First method: creating a new attribute with parameters
>>> cdf["var1"].add_attribute("attr1", np.arange(10, dtype=np.int32), DataType.CDF_INT4)
attr1: [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ]
>>> # Second method: adding an existing attribute
>>> var2 = cdf.add_variable("var2", np.arange(5))
>>> var2.add_attribute(cdf["var1"].attributes["attr1"])
attr1: [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ]
property attributes
property compression
property is_nrv
property majority
property name
set_values(values, data_type=None, force=False)

Sets or resets the values of the variable.

Parameters:
values: numpy.ndarray or list or tuple or Variable

The values to set for the variable.

data_type: DataType or None, optional

The data type of the variable. If None, the data type is inferred from the values. (Default is None) When integer values are passed as a list or tuple, the smallest data type that can hold all the values is chosen. When a Variable is passed, the data type is taken from that Variable.

force: bool, optional

If True, existing values can be overwritten even if the shape or data type does not match. (Default is False)

Returns:
None

Raises:
ValueError

If the shape or data type does not match and force is False.

Examples

>>> from pycdfpp import CDF, DataType
>>> import numpy as np
>>> cdf = CDF()
>>> cdf.add_variable("var1")
var1:
  shape: [ ]
  type: CDF_NONE
  record vary: True
  compression: None
>>> # Setting values with numpy array
>>> cdf["var1"].set_values(np.arange(10, 20, dtype=np.int32))
>>> cdf["var1"].values
array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19], dtype=int32)
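
The role of the force flag can be pictured with a toy model. `SketchVariable` below is a hypothetical stand-in, not pycdfpp's implementation: it reduces "shape" to the number of values and represents data types as plain strings, both of which are simplifying assumptions.

```python
class SketchVariable:
    """Hypothetical stand-in for a variable; not pycdfpp's implementation."""

    def __init__(self):
        self.values = None
        self.data_type = None

    def set_values(self, values, data_type, force=False):
        values = list(values)
        is_empty = self.values is None
        # Assumption: "shape" is reduced to the number of values here.
        matches = (
            not is_empty
            and len(values) == len(self.values)
            and data_type == self.data_type
        )
        if not (is_empty or matches or force):
            raise ValueError("shape or data type mismatch; pass force=True")
        self.values = values
        self.data_type = data_type


var = SketchVariable()
var.set_values(range(10, 20), "CDF_INT4")     # empty variable: accepted
var.set_values(range(20, 30), "CDF_INT4")     # same shape and type: accepted
try:
    var.set_values([1.0, 2.0], "CDF_DOUBLE")  # mismatch without force
except ValueError as err:
    print("rejected:", err)
var.set_values([1.0, 2.0], "CDF_DOUBLE", force=True)
print(var.values, var.data_type)
```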
property shape
property type
property values
property values_encoded
property values_loaded
class pycdfpp.epoch

Bases: pybind11_object

property mseconds
class pycdfpp.epoch16

Bases: pybind11_object

property picoseconds
property seconds
pycdfpp.load(file_or_buffer: str, iso_8859_1_to_utf8: bool = True, lazy_load: bool = True)[source]

Load and parse a CDF file.

Parameters:
file_or_buffer: str or ByteString

Either a filename to be loaded or an in-memory file implementing the Python buffer protocol.

iso_8859_1_to_utf8: bool, optional

Automatically convert Latin-1 characters to their UTF-8 equivalents when True. CDF files prior to version 3.8 did not support UTF-8, and some of them might contain “illegal” Latin-1 characters. This option has no impact on valid UTF-8 characters. (Default is True)

lazy_load: bool, optional

Controls whether variable values are loaded immediately or only when accessed by the user. If True, variable values are loaded on demand. If False, all variable values are loaded during parsing. (Default is True)

Returns:
CDF or None

Returns a CDF object upon successful read. If there’s an issue with the read, None is returned.
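
To see why the iso_8859_1_to_utf8 option exists, the stdlib-only sketch below shows Latin-1 bytes that are not valid UTF-8 and what re-encoding them produces. The sample bytes are made up, and the snippet does not reproduce pycdfpp's conversion logic; it only illustrates the difference between the two encodings.

```python
# Latin-1 bytes: 0xB0 is the degree sign, 0xB5 is the micro sign.
raw = b"Temp (\xb0C), B (\xb5T)"

# Decoding as UTF-8 fails because 0xB0 is not a valid start byte.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# Re-encoding from Latin-1 yields valid UTF-8 (two bytes per such character).
converted = raw.decode("latin-1").encode("utf-8")

print(valid_utf8)                 # False
print(converted.decode("utf-8"))  # Temp (°C), B (µT)
```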

pycdfpp.save(*args, **kwargs)

Overloaded function.

  1. save(cdf: pycdfpp._pycdfpp.CDF, fname: str) -> bool

  2. save(cdf: pycdfpp._pycdfpp.CDF) -> pycdfpp._pycdfpp._cdf_bytes

pycdfpp.to_datetime(values)[source]

Convert any compatible collection of time values to a list of datetime.datetime objects.

Parameters:
values: Variable or epoch or List[epoch] or epoch16 or List[epoch16] or tt2000_t or List[tt2000_t] or numpy.array[numpy.datetime64[ns]]

Input value(s) to convert to datetime.datetime.

Returns:
List[datetime.datetime]
Raises:
TypeError or IndexError

If the input values are not compatible time types.
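
As background for the epoch input type: in the CDF format, CDF_EPOCH values count milliseconds from 0000-01-01T00:00:00 (this comes from the CDF format itself, not from the text above). A minimal sketch of that arithmetic with the standard library follows; `cdf_epoch_ms_to_datetime` is a hypothetical helper, it is assumed that `epoch.mseconds` exposes this millisecond count, and in practice `to_datetime` should be used instead.

```python
from datetime import datetime, timedelta

# Milliseconds between 0000-01-01T00:00:00 and 1970-01-01T00:00:00.
CDF_EPOCH_TO_UNIX_MS = 62_167_219_200_000


def cdf_epoch_ms_to_datetime(mseconds):
    """Hypothetical helper: CDF_EPOCH milliseconds -> naive UTC datetime."""
    unix_ms = mseconds - CDF_EPOCH_TO_UNIX_MS
    return datetime(1970, 1, 1) + timedelta(milliseconds=unix_ms)


print(cdf_epoch_ms_to_datetime(62_167_219_200_000.0))  # 1970-01-01 00:00:00
print(cdf_epoch_ms_to_datetime(63_745_056_000_000.0))  # 2020-01-01 00:00:00
```

TT2000 values need leap-second handling in addition to an offset, which is one reason to rely on the library conversion instead of hand-written arithmetic.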

pycdfpp.to_datetime64(values)[source]

Convert any compatible given collection of time values to a numpy.datetime64 array.

Parameters:
values: Variable or epoch or List[epoch] or numpy.ndarray[epoch] or epoch16 or List[epoch16] or numpy.array[epoch16] or tt2000_t or List[tt2000_t] or numpy.array[tt2000_t]

Input value(s) to convert to numpy.datetime64.

Returns:
numpy.ndarray[numpy.datetime64]
Raises:
TypeError or IndexError

If the input values are not compatible time types.

pycdfpp.to_time_string(values, format: str)[source]

Format CDF time values as an array of fixed-width ASCII strings.

Parameters:
values: Variable or numpy.ndarray[tt2000_t] or numpy.ndarray[epoch] or numpy.ndarray[epoch16]

CDF time values to format.

format: str

strftime-compatible format string (e.g. '%Y-%m-%dT%H:%M:%SZ'). %S automatically includes sub-second digits matching the input precision (3 for epoch, 9 for tt2000, 12 for epoch16).

Returns:
numpy.ndarray

Array of byte strings (dtype S{N}) with the same shape as input.
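
The idea of %S carrying sub-second digits can be emulated with the standard library, as sketched below. `format_ns` is a hypothetical helper working on nanoseconds since 1970, not the library function; the "." separator between seconds and fraction is an assumption, and the sketch handles at most nine digits.

```python
from datetime import datetime, timedelta


def format_ns(ns_since_1970, fmt, digits=9):
    """Hypothetical emulation of '%S' carrying sub-second digits."""
    seconds, frac_ns = divmod(ns_since_1970, 1_000_000_000)
    moment = datetime(1970, 1, 1) + timedelta(seconds=seconds)
    # Keep only the requested number of fractional digits (truncation).
    frac = f"{frac_ns:09d}"[:digits]
    # Assumption: the fraction follows the seconds after a "." separator.
    return moment.strftime(fmt.replace("%S", f"%S.{frac}"))


stamp = 1_577_836_800_123_456_789  # 2020-01-01T00:00:00.123456789 UTC
print(format_ns(stamp, "%Y-%m-%dT%H:%M:%SZ"))            # nanosecond digits
print(format_ns(stamp, "%Y-%m-%dT%H:%M:%SZ", digits=3))  # millisecond digits
```

Because every formatted value has the same width for a given format and precision, the results fit naturally into a fixed-width byte-string array such as the S{N} dtype returned by to_time_string.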

class pycdfpp.tt2000_t

Bases: pybind11_object

property nseconds