Skip to content

Writing a RateMap to text (hapmap) #3252

@nspope

Description

@nspope

I frequently use msprime.RateMap and often need to write these out to hapmap format, e.g. for input to a standalone tool. I noticed that tskit-dev/msprime#1338 discusses adding RateMap.write_hapmap, but afaik this isn't in the works. Is there still interest in including a method along these lines? Something similar to,

def write_hapmap(ratemap: msprime.RateMap, contig_name: str) -> str:
    """
    Write a recombination rate map into hapmap format.
    """
    physical_position = ratemap.position.astype(np.int64)
    scaled_rate = ratemap.rate * 1e8
    map_position = ratemap.get_cumulative_mass(physical_position) * 100
    hapmap = ["Chromosome\tPosition(bp)\tRate(cM/Mb)\tMap(cM)"]
    if np.isnan(scaled_rate[-1]):  # handle trailing NaN
        scaled_rate[-1] = 0.0
    else:
        scaled_rate = np.append(scaled_rate, 0.0)
    for rate, pos, map in zip(scaled_rate, physical_position, map_position):
        if not np.isnan(rate): 
            hapmap.append(f"{contig_name}\t{pos}\t{rate:.10f}\t{map:.10f}")
    hapmap = "\n".join(hapmap) + "\n"
    return hapmap
 

# for tests
def assert_hapmap_equals_ratemap(ratemap: msprime.RateMap, hapmap_path: str, atol: float = 1e-12):
    """ 
    Check that hapmap file matches rate map.
    """
    hapmap_check = msprime.RateMap.read_hapmap(hapmap_path, sequence_length=ratemap.sequence_length, rate_col=2)
    np.testing.assert_allclose(hapmap_check.rate, ratemap.rate, atol=atol)
    np.testing.assert_allclose(hapmap_check.position, ratemap.position)
    hapmap_check = msprime.RateMap.read_hapmap(hapmap_path, sequence_length=ratemap.sequence_length, map_col=3)
    np.testing.assert_allclose(hapmap_check.rate, ratemap.rate, atol=atol)
    np.testing.assert_allclose(hapmap_check.position, ratemap.position) 

But perhaps with arguments controlling scaling of rates/map positions; precision of text output; and header -- so that non-recombination maps could be written out in meaningful units?

Metadata

Metadata

Assignees

No one assigned

    Labels

    Python APIIssue is about the Python APIenhancementNew feature or request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions