
A more efficient way of reading MD trajectory #367

Open
@njzjz


In the workflow, we do not need to read every frame of a trajectory, only the frames we want. So we should first build a dict that maps each trajectory to the indices of the frames to read from it:

frames_dict = {
  Trajectory0: [23, 56, 78],
  Trajectory1: [22],
  ...
}
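One way such a dict could be assembled is sketched below. It assumes the workflow produces a flat list of (trajectory, frame index) pairs; the function name `build_frames_dict` and the file names are hypothetical and only serve the illustration.

```python
from collections import defaultdict


def build_frames_dict(selected):
    """Group (trajectory, frame_index) pairs by trajectory (hypothetical helper)."""
    frames = defaultdict(set)  # a set drops duplicate frame indices
    for traj, idx in selected:
        frames[traj].add(idx)
    # Sort the indices so each trajectory can be read in one forward pass
    return {traj: sorted(idxs) for traj, idxs in frames.items()}


# Hypothetical selection; plain file names stand in for trajectory objects
selected = [
    ("traj0.lammpstrj", 56),
    ("traj1.lammpstrj", 22),
    ("traj0.lammpstrj", 23),
    ("traj0.lammpstrj", 78),
    ("traj0.lammpstrj", 23),
]
frames_dict = build_frames_dict(selected)
```

Sorting and de-duplicating up front means each file only needs to be opened once.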

Then read each trajectory, passing only the frame indices it needs:

for traj, f_idx in frames_dict.items():
    traj.read(f_idx)

For a LAMMPS trajectory or other raw text files, the read method could look like this:

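import itertools

def read(self, f_idx: list[int]):
    wanted = set(f_idx)  # set lookup is O(1) per block
    with open(self.fname) as f:
        # Group the file into blocks of self.nlines lines, one block per frame
        for ii, lines in enumerate(itertools.zip_longest(*[f] * self.nlines)):
            if ii not in wanted:
                continue
            self.process_block(lines)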

where nlines is the number of lines in each block, which should be determined at the very beginning, before any frame is read. Usually, every frame has the same number of lines.
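A minimal sketch of how nlines might be determined and used is given below. It assumes a LAMMPS text dump in which every frame starts with an `ITEM: TIMESTEP` line and all frames have the same number of lines. The names `count_block_lines` and `read_blocks` are hypothetical, not existing dpdata functions.

```python
import itertools


def count_block_lines(fname):
    """Count the lines of the first frame by scanning to the second 'ITEM: TIMESTEP'."""
    nlines = 0
    with open(fname) as f:
        for line in f:
            if line.startswith("ITEM: TIMESTEP") and nlines > 0:
                break  # reached the start of the second frame
            nlines += 1
    return nlines


def read_blocks(fname, f_idx, nlines):
    """Yield (frame_index, lines) for the requested frames only."""
    wanted = set(f_idx)
    if not wanted:
        return
    last = max(wanted)  # nothing past this frame needs to be read
    with open(fname) as f:
        for ii in range(last + 1):
            block = list(itertools.islice(f, nlines))
            if len(block) < nlines:
                break  # file ended before this frame was complete
            if ii in wanted:
                yield ii, block
```

Compared with the zip_longest loop, this variant stops at the last requested frame instead of scanning to the end of the file, which matters when only early frames are needed from a long trajectory.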

The process_block method should convert a LAMMPS frame to dpdata.
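As a rough illustration of what process_block has to do, the sketch below parses one frame into a plain dict. It assumes the standard text dump layout (a nine-line header followed by one line per atom) and purely numeric atom columns. The returned dict layout is hypothetical and is not dpdata's internal format; a real implementation would hand the block to dpdata's own LAMMPS conversion.

```python
def process_block(lines):
    """Parse one LAMMPS text dump frame into a plain dict (illustrative layout)."""
    # zip_longest pads an incomplete final block with None, so drop those
    lines = [ln.strip() for ln in lines if ln is not None]
    timestep = int(lines[1])  # line after "ITEM: TIMESTEP"
    natoms = int(lines[3])    # line after "ITEM: NUMBER OF ATOMS"
    # Three lines after "ITEM: BOX BOUNDS ...": lower and upper bound per axis
    bounds = [tuple(float(x) for x in ln.split()[:2]) for ln in lines[5:8]]
    # "ITEM: ATOMS id type x y z" -> ["id", "type", "x", "y", "z"]
    columns = lines[8].split()[2:]
    rows = [ln.split() for ln in lines[9:9 + natoms]]
    # Assumption: every column is numeric (no element-name column)
    atoms = {col: [float(r[i]) for r in rows] for i, col in enumerate(columns)}
    return {"timestep": timestep, "natoms": natoms, "bounds": bounds, "atoms": atoms}


block = (
    "ITEM: TIMESTEP\n", "0\n",
    "ITEM: NUMBER OF ATOMS\n", "2\n",
    "ITEM: BOX BOUNDS pp pp pp\n", "0.0 10.0\n", "0.0 10.0\n", "0.0 10.0\n",
    "ITEM: ATOMS id type x y z\n", "1 1 0.0 0.0 0.0\n", "2 2 1.0 1.0 1.0\n",
)
frame = process_block(block)
```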
