PyCDFpp (PyCDF++)
A NASA’s CDF modern C++ library with Python bindings thanks to PyBind11. Why? CDF files are still used for space physics missions but few implementations are available. The main one is NASA’s C implementation available here but it lacks multi-threads support (global shared state), has an old C interface and has a license which isn’t compatible with most Linux distributions policy. There are also Java and Python implementations which are not usable in C++.
Quickstart
Reading CDF files
Basic example from a local file:
import pycdfpp
cdf = pycdfpp.load("some_cdf.cdf")
cdf_var_data = cdf["var_name"].values #builds a numpy view or a list of strings
attribute_name_first_value = cdf.attributes['attribute_name'][0]
Note that you can also load in memory files:
import pycdfpp
import requests
import matplotlib.pyplot as plt
tha_l2_fgm = pycdfpp.load(requests.get("https://spdf.gsfc.nasa.gov/pub/data/themis/tha/l2/fgm/2016/tha_l2_fgm_20160101_v01.cdf").content)
plt.plot(tha_l2_fgm["tha_fgl_gsm"])
plt.show()
Buffer protocol support:
import pycdfpp
import requests
import xarray as xr
import matplotlib.pyplot as plt
tha_l2_fgm = pycdfpp.load(requests.get("https://spdf.gsfc.nasa.gov/pub/data/themis/tha/l2/fgm/2016/tha_l2_fgm_20160101_v01.cdf").content)
xr.DataArray(tha_l2_fgm['tha_fgl_gsm'], dims=['time', 'components'], coords={'time':tha_l2_fgm['tha_fgl_time'].values, 'components':['x', 'y', 'z']}).plot.line(x='time')
plt.show()
# Works with matplotlib directly too
plt.plot(tha_l2_fgm['tha_fgl_time'], tha_l2_fgm['tha_fgl_gsm'])
plt.show()
Datetimes handling:
import pycdfpp
import os
# Due to an issue with pybind11 you have to force your timezone to UTC for
# datetime conversion (not necessary for numpy datetime64)
os.environ['TZ']='UTC'
mms2_fgm_srvy = pycdfpp.load("mms2_fgm_srvy_l2_20200201_v5.230.0.cdf")
# to convert any CDF variable holding any time type to python datetime:
epoch_dt = pycdfpp.to_datetime(mms2_fgm_srvy["Epoch"])
# same with numpy datetime64:
epoch_dt64 = pycdfpp.to_datetime64(mms2_fgm_srvy["Epoch"])
# note that using datetime64 is ~100x faster than datetime (~2ns/element on an average laptop)
Writing CDF files
Creating a basic CDF file:
import pycdfpp
import numpy as np
from datetime import datetime
cdf = pycdfpp.CDF()
cdf.add_attribute("some attribute", [[1,2,3], [datetime(2018,1,1), datetime(2018,1,2)], "hello\nworld"])
cdf.add_variable(f"some variable", values=np.ones((10),dtype=np.float64))
pycdfpp.save(cdf, "some_cdf.cdf")