Weird non-deterministic behavior on PyTorch imagenet #382

Open
@jma127

Description

Repro

Environment

  • PyTorch master
  • CUDA 9.0
  • Driver 384.81
  • Ubuntu 16.04

Expected behavior

The two runs have the same output.

Actual behavior

The two runs produce the same output when you run them one after the other (e.g. GPU 0 first, then Ctrl-C, then GPU 1). But when you run them at the same time, their outputs differ.
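
A minimal sketch of how the simultaneous case could be compared, assuming each run is pinned to one GPU through `CUDA_VISIBLE_DEVICES` and writes its log to stdout. The actual repro command is not shown here, so `cmd` is a placeholder; `run_concurrently`, `first_divergence`, and the `gpu<N>.log` file names are hypothetical helpers, not part of PyTorch or of the original repro.

```python
import os
import subprocess
from itertools import zip_longest


def run_concurrently(cmd, gpu_ids=(0, 1), log_dir="."):
    """Start one copy of `cmd` per GPU at the same time; return the log paths."""
    running = []
    for gpu in gpu_ids:
        path = os.path.join(log_dir, f"gpu{gpu}.log")  # hypothetical file name
        # Each child only sees the single GPU named in CUDA_VISIBLE_DEVICES.
        env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
        log = open(path, "w")
        proc = subprocess.Popen(cmd, env=env, stdout=log, stderr=subprocess.STDOUT)
        running.append((proc, log, path))

    paths = []
    for proc, log, path in running:
        proc.wait()  # all children were already started above, so they overlap
        log.close()
        paths.append(path)
    return paths


def first_divergence(path_a, path_b):
    """Return the 0-based index of the first differing line, or None if identical."""
    with open(path_a) as fa, open(path_b) as fb:
        for index, (line_a, line_b) in enumerate(zip_longest(fa, fb)):
            if line_a != line_b:
                return index
    return None


# Example (placeholder command, substitute the real training invocation):
#   logs = run_concurrently(["python", "main.py", "..."], gpu_ids=(0, 1))
#   print(first_divergence(*logs))
```

Writing each run's output to its own file avoids blocking a child on a full pipe, so both processes really do execute at the same time. Running the same helper with a single GPU id, twice in a row, gives the sequential baseline to compare against.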

Suspicion

This looks like a driver bug: I don't see how PyTorch would be able to bypass CUDA_VISIBLE_DEVICES-based GPU segregation. Posting here for visibility anyway.

cc @shubho @ssnl @soumith @ailzhang
