pycdfpp package
- class pycdfpp.Attribute
Bases: pybind11_object
- property name
- set_values(entries_values=None, entries_types=None)
Sets the values of the attribute.
This method can be called in two ways:
1. With values and optional types: set_values(entries_values, entries_types=None)
2. With another Attribute object: set_values(attribute)
- Parameters:
- entries_values: List[np.ndarray or List[float or int or datetime] or str]
The values of the entries to set for the attribute. When a list is passed, its values are converted to a numpy.ndarray with the appropriate data type; for integers, the smallest data type that can hold all the values is chosen.
- entries_types: List[DataType] or None, optional
The data type for each entry of the attribute. If None, the data type is inferred from the values. (Default is None)
- attribute: Attribute
An existing Attribute object to copy the values from (for the second calling method).
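The "smallest data type that can hold all the values" rule for integer lists can be pictured with a small NumPy sketch. The helper name `smallest_int_dtype` is hypothetical, and the choice to try only signed types is an assumption made for illustration; the documentation does not say which candidate types pycdfpp actually considers.

```python
import numpy as np


def smallest_int_dtype(values):
    """Hypothetical helper: narrowest signed NumPy integer dtype holding every value.

    Assumption: only signed types are tried; pycdfpp's real selection may differ.
    """
    lo, hi = min(values), max(values)
    for candidate in (np.int8, np.int16, np.int32, np.int64):
        info = np.iinfo(candidate)
        if info.min <= lo and hi <= info.max:
            return np.dtype(candidate)
    raise OverflowError("values do not fit in a 64-bit signed integer")


# Convert plain Python lists the way the description suggests:
small = np.asarray([0, 1, 127], dtype=smallest_int_dtype([0, 1, 127]))
wide = np.asarray([0, 128], dtype=smallest_int_dtype([0, 128]))
print(small.dtype, wide.dtype)
```

When the on-disk type matters, passing entries_types explicitly avoids relying on inference.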
- type(self: pycdfpp._pycdfpp.Attribute, arg0: SupportsInt) -> pycdfpp._pycdfpp.DataType
- class pycdfpp.CDF
Bases: pybind11_object
A CDF file object.
- Attributes:
- attributes: dict
file attributes
- variables: dict
file variables
- majority: cdf_majority
file majority
- distribution_version: int
file distribution version
- lazy_loaded: bool
file lazy loading state
- compression: CompressionType
file compression type
Methods
add_attribute([name, entries_values, ...]): Adds a new attribute to the CDF.
add_variable([name, values, data_type, ...]): Adds a new variable to the CDF.
- add_attribute(name=None, entries_values=None, entries_types=None) -> Attribute
Adds a new attribute to the CDF.
This method can be called in two ways:
1. With attribute parameters: add_attribute(name, entries_values, entries_types=None)
2. With an Attribute object: add_attribute(attribute)
- Parameters:
- name: str
The name of the attribute to add.
- entries_values: List[np.ndarray or List[float or int or datetime] or str]
The values of the entries to set for the attribute. When a list is passed, its values are converted to a numpy.ndarray with the appropriate data type; for integers, the smallest data type that can hold all the values is chosen.
- entries_types: List[DataType] or None, optional
The data type for each entry of the attribute. If None, the data type is inferred from the values. (Default is None)
- attribute: Attribute
An existing Attribute object to add to the CDF (for the second calling method).
- Returns:
- Attribute or None
Returns the newly created attribute if successful. Otherwise, returns None.
- Raises:
- ValueError
If the attribute already exists.
Examples
>>> from pycdfpp import CDF, DataType
>>> import numpy as np
>>> from datetime import datetime
>>> cdf = CDF()
>>> # First method: creating a new attribute with parameters
>>> cdf.add_attribute("attr1", [np.arange(10, dtype=np.int32)], [DataType.CDF_INT4])
attr1: [ [ [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ] ] ]
>>> # Second method: adding an existing attribute
>>> cdf2 = CDF()
>>> cdf2.add_attribute(cdf.attributes["attr1"])
attr1: [ [ [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ] ] ]
>>> # Another example with multiple entries of different types
>>> cdf.add_attribute("multi", [np.arange(2, dtype=np.int32), [1.,2.,3.], "hello", [datetime(2010,1,1), datetime(2020,1,1)]])
multi: [ [ [ 0, 1 ], [ 1, 2, 3 ], "hello", [ 2010-01-01T00:00:00.000000000, 2020-01-01T00:00:00.000000000 ] ] ]
- add_variable(name=None, values=None, data_type=None, is_nrv=False, compression=CompressionType.no_compression, attributes=None) -> Variable
Adds a new variable to the CDF.
This method can be called in two ways:
1. With variable parameters: add_variable(name, values=None, data_type=None, is_nrv=False, compression=CompressionType.no_compression, attributes=None)
2. With a Variable object: add_variable(variable)
- Parameters:
- name: str
The name of the variable to add.
- values: numpy.ndarray or list or None, optional
The values to set for the variable. If None, the variable is created with no values. (Default is None) When a list is passed, its values are converted to a numpy.ndarray with the appropriate data type; for integers, the smallest data type that can hold all the values is chosen.
- data_type: DataType or None, optional
The data type of the variable. If None, the data type is inferred from the values. (Default is None)
- is_nrv: bool, optional
Whether the variable is a non-record variable. (Default is False)
- compression: CompressionType, optional
The compression type to use for the variable. (Default is CompressionType.no_compression)
- attributes: Mapping[str, List[Any]] or None, optional
The attributes to set for the variable. If None, the variable is created with no attributes. (Default is None)
- variable: Variable
An existing Variable object to add to the CDF (for the second calling method).
- Returns:
- Variable or None
Returns the newly created variable if successful. Otherwise, returns None.
- Raises:
- ValueError
If the variable already exists.
Examples
>>> from pycdfpp import CDF, DataType, CompressionType
>>> import numpy as np
>>> cdf = CDF()
>>> # First method: creating a new variable with parameters
>>> cdf.add_variable("var1", np.arange(10, dtype=np.int32), DataType.CDF_INT4, compression=CompressionType.gzip_compression)
var1:
shape: [ 10 ]
type: CDF_INT4
record vary: True
compression: GNU GZIP
...
>>> # Second method: adding an existing variable
>>> cdf2 = CDF()
>>> cdf2.add_variable(cdf["var1"])  # var1 was added to cdf by the first method
var1:
shape: [ 10 ]
type: CDF_INT4
record vary: True
compression: GNU GZIP
...
- property attributes
- property compression
- property distribution_version
- filter(variables: List[str] | str | Pattern | Callable[[Variable], bool] = None, attributes: List[str] | str | Pattern | Callable[[Attribute], bool] = None, inplace=False) -> CDF
Filters the CDF object based on the provided criteria.
- Parameters:
- cdf: CDF
The CDF object to filter.
- variables: Union[List[str], str, re.Pattern, Callable[[Variable], bool]], optional
A list of variable names to keep, a regex pattern, or a callable that returns True for variables to keep. If None (default), no variables are kept.
- attributes: Union[List[str], str, re.Pattern, Callable[[Attribute], bool]], optional
A list of attribute names to keep, a regex pattern, or a callable that returns True for attributes to keep. If None (default), no attributes are kept.
- inplace: bool, optional
If True, modifies the original CDF object. If False, returns a new filtered CDF object. (Default is False)
- Returns:
- CDF
Returns a new CDF object with the filtered variables and attributes.
- items(self: pycdfpp._pycdfpp.CDF) -> collections.abc.Iterator[tuple[str, pycdfpp._pycdfpp.Variable]]
- keys(self: pycdfpp._pycdfpp.CDF) -> list[str]
- property lazy_loaded
- property majority
- class pycdfpp.CompressionType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)
Bases: Enum
- ahuff_compression = 3
- gzip_compression = 5
- huff_compression = 2
- no_compression = 0
- rle_compression = 1
- class pycdfpp.DataType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)
Bases: Enum
- CDF_BYTE = 41
- CDF_CHAR = 51
- CDF_DOUBLE = 45
- CDF_EPOCH = 31
- CDF_EPOCH16 = 32
- CDF_FLOAT = 44
- CDF_INT1 = 1
- CDF_INT2 = 2
- CDF_INT4 = 4
- CDF_INT8 = 8
- CDF_NONE = 0
- CDF_REAL4 = 21
- CDF_REAL8 = 22
- CDF_TIME_TT2000 = 33
- CDF_UCHAR = 52
- CDF_UINT1 = 11
- CDF_UINT2 = 12
- CDF_UINT4 = 14
- class pycdfpp.Majority(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)
Bases: Enum
- column = 0
- row = 1
- class pycdfpp.Variable
Bases: pybind11_object
A CDF Variable (either R or Z variable)
- Attributes:
- attributes: dict
variable attributes
- name: str
variable name
- type: DataType
variable data type (e.g. CDF_DOUBLE, CDF_TIME_TT2000, …)
- shape: List[int]
variable shape (records + record shape)
- majority: cdf_majority
variable majority as written in the CDF file. Note that pycdfpp always exposes row-major data.
- values_loaded: bool
True if the values are available in memory. This is useful with lazy loading to know whether the values are already loaded.
- compression: CompressionType
variable compression type (supported values are no_compression, rle_compression, gzip_compression)
- values: numpy.array
returns the variable values as a numpy.array of the corresponding dtype and shape. No copies are involved: the returned array is just a view on the variable data.
- values_encoded: numpy.array
same as values, except that string variables are encoded, which involves a data copy. Since numpy uses UTF-32, expect a 4x memory increase for string values.
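The 4x figure for encoded strings follows from NumPy's own storage rules and can be checked without pycdfpp. The sketch below uses plain NumPy arrays as stand-ins for string variable data; it does not call values or values_encoded.

```python
import numpy as np

# Byte strings: one byte per character (dtype S5).
raw = np.array([b"alpha", b"gamma", b"delta"], dtype="S5")

# Decoded strings: NumPy stores unicode as UTF-32, four bytes per character.
decoded = np.char.decode(raw, "utf-8")

print(raw.dtype, raw.nbytes)
print(decoded.dtype, decoded.nbytes)
```

For large string variables, working on the raw bytes and decoding only what is needed keeps memory use down.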
Methods
add_attribute([name, values, data_type]): Adds a new attribute to the variable.
set_values(values[, data_type, force]): Sets or resets the values of the variable.
set_compression_type: Sets the variable compression type.
- add_attribute(name=None, values=None, data_type=None) -> VariableAttribute
Adds a new attribute to the variable.
This method can be called in two ways:
1. With attribute parameters: add_attribute(name, values, data_type=None)
2. With a VariableAttribute object: add_attribute(attribute)
- Parameters:
- name: str
The name of the attribute to add.
- values: np.ndarray or List[float or int or datetime] or str
The values to set for the attribute. When a list is passed, its values are converted to a numpy.ndarray with the appropriate data type; for integers, the smallest data type that can hold all the values is chosen.
- data_type: DataType or None, optional
The data type of the attribute. If None, the data type is inferred from the values. (Default is None)
- attribute: VariableAttribute
An existing VariableAttribute object to add to the variable (for the second calling method).
- Returns:
- VariableAttribute
Returns the newly created attribute if successful.
- Raises:
- ValueError
If the attribute already exists.
Examples
>>> from pycdfpp import CDF, DataType
>>> import numpy as np
>>> cdf = CDF()
>>> cdf.add_variable("var1", np.arange(10, dtype=np.int32), DataType.CDF_INT4)
var1:
shape: [ 10 ]
type: CDF_INT4
record vary: True
compression: None
...
>>> # First method: creating a new attribute with parameters
>>> cdf["var1"].add_attribute("attr1", np.arange(10, dtype=np.int32), DataType.CDF_INT4)
attr1: [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ]
>>> # Second method: adding an existing attribute
>>> var2 = cdf.add_variable("var2", np.arange(5))
>>> var2.add_attribute(cdf["var1"].attributes["attr1"])
attr1: [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ]
- property attributes
- property compression
- property is_nrv
- property majority
- property name
- set_values(values, data_type=None, force=False)
Sets or resets the values of the variable.
- Parameters:
- values: numpy.ndarray or list or tuple or Variable
The values to set for the variable.
- data_type: DataType or None, optional
The data type of the variable. If None, the data type is inferred from the values. (Default is None) When integer values are passed as a list or tuple, the smallest data type that can hold all the values is chosen. When a Variable is passed, the data type is taken from that Variable.
- force: bool, optional
If True, existing values can be overwritten even if the shape or data type does not match. (Default is False)
- Returns:
- None
- Raises:
- ValueError
If the shape or data type does not match and force is False.
Examples
>>> from pycdfpp import CDF, DataType
>>> import numpy as np
>>> cdf = CDF()
>>> cdf.add_variable("var1")
var1:
shape: [ ]
type: CDF_NONE
record vary: True
compression: None
...
>>> # Setting values with numpy array
>>> cdf["var1"].set_values(np.arange(10, 20, dtype=np.int32))
>>> cdf["var1"].values
array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19], dtype=int32)
- property shape
- property type
- property values
- property values_encoded
- property values_loaded
- pycdfpp.load(file_or_buffer: str, iso_8859_1_to_utf8: bool = True, lazy_load: bool = True)
Load and parse a CDF file.
- Parameters:
- file_or_buffer: str or ByteString
Either a filename to be loaded or an in-memory file implementing the Python buffer protocol.
- iso_8859_1_to_utf8: bool, optional
Automatically convert Latin-1 characters to their UTF-8 equivalents when True. CDF files prior to version 3.8 did not support UTF-8, and some of them might contain “illegal” Latin-1 characters. This option has no impact on valid UTF-8 characters. (Default is True)
- lazy_load: bool, optional
Controls whether variable values are loaded immediately or only when accessed by the user. If True, variables’ values are loaded on demand. If False, all variable values are loaded during parsing. (Default is True)
- Returns:
- CDF or None
Returns a CDF object upon successful read. If there’s an issue with the read, None is returned.
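The reason for the iso_8859_1_to_utf8 option can be seen with plain Python bytes. This sketch only illustrates the encoding problem with the standard library; it does not reproduce pycdfpp's internal conversion, and the sample string is made up.

```python
# "25 °C" as it might be stored in an old, Latin-1 encoded CDF string.
raw = b"25 \xb0C"

# The byte 0xB0 alone is not valid UTF-8, so a strict decode fails.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# Reading the bytes as Latin-1 and re-encoding gives the UTF-8 equivalent.
repaired = raw.decode("latin-1").encode("utf-8")

print(valid_utf8, repaired, repaired.decode("utf-8"))
```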
- pycdfpp.save(*args, **kwargs)
Overloaded function.
save(cdf: pycdfpp._pycdfpp.CDF, fname: str) -> bool
save(cdf: pycdfpp._pycdfpp.CDF) -> pycdfpp._pycdfpp._cdf_bytes
- pycdfpp.to_datetime(values)
Convert any compatible given collection of time values to a list of datetime.datetime objects.
- Parameters:
- values: Variable or epoch or List[epoch] or epoch16 or List[epoch16] or tt2000_t or List[tt2000_t] or numpy.array[numpy.datetime64[ns]]
Input value(s) to convert to datetime.datetime.
- Returns:
- List[datetime.datetime]
- Raises:
- TypeError or IndexError
If the input values are not compatible time types.
- pycdfpp.to_datetime64(values)
Convert any compatible given collection of time values to a numpy.datetime64 array.
- Parameters:
- values: Variable or epoch or List[epoch] or numpy.ndarray[epoch] or epoch16 or List[epoch16] or numpy.array[epoch16] or tt2000_t or List[tt2000_t] or numpy.array[tt2000_t]
Input value(s) to convert to numpy.datetime64.
- Returns:
- numpy.ndarray[numpy.datetime64]
- Raises:
- TypeError or IndexError
If the input values are not compatible time types.
- pycdfpp.to_time_string(values, format: str)
Format CDF time values as an array of fixed-width ASCII strings.
- Parameters:
- values: Variable or numpy.ndarray[tt2000_t] or numpy.ndarray[epoch] or numpy.ndarray[epoch16]
CDF time values to format.
- format: str
strftime-compatible format string (e.g. '%Y-%m-%dT%H:%M:%SZ'). %S automatically includes sub-second digits matching the input precision (3 for epoch, 9 for tt2000, 12 for epoch16).
- Returns:
- numpy.ndarray
Array of byte strings (dtype S{N}) with the same shape as the input.