Review of plot_column_values() in draw.py

  • Don't hardcode paths; pass them in as variables (e.g. 1). We would like to be able to run this code on any file in any folder.
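
Example (a minimal sketch; `build_figure_path`, its parameters, and the "_positions" naming scheme are hypothetical and not taken from draw.py). The idea is that the caller supplies both the input file and the output folder, so nothing inside the function depends on one machine's directory layout:

```python
from pathlib import Path


def build_figure_path(data_path: Path, output_dir: Path, suffix: str = ".png") -> Path:
    """Return the path where the figure for `data_path` should be saved.

    Hypothetical helper: the function name and naming scheme are assumptions.

    Args:
        data_path (Path): Input data file, in any folder.
        output_dir (Path): Folder where figures are written.
        suffix (str): File extension of the figure.

    Returns:
        Path: `output_dir` joined with a file name derived from `data_path`.
    """
    # Reuse the input file's stem so each data file gets its own figure.
    return Path(output_dir) / (Path(data_path).stem + "_positions" + suffix)
```

The plotting function could then take `data_path` and `output_dir` as arguments instead of containing the path strings itself.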

  • Include docstrings and specify types. Explain the purpose of each parameter. Here it's not clear to me what keep_test_mouse is for.

Example:

def add(a: int, b: int) -> int:
    """Add two integers.

    Args:
        a (int): First integer.
        b (int): Second integer.

    Returns:
        int: The sum of `a` and `b`.
    """
    return a + b

  • Here 'nothing' should become a module-level constant. You can define it after the import section, before all the functions:
FILLNA_BEHAVIOR = 'nothing'
..
phase_data['manual_annot'] = phase_data['manual_annot'].fillna(FILLNA_BEHAVIOR)
  • If I understand correctly, this function does two things: 1) fills missing values in the manual annotation column and 2) plots x and y position over time. Can we split it into two functions? Can we also keep 'draw.py' for plotting functions only and create another file for data cleaning / preprocessing? It would be good to start thinking of this as a "pipeline".
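
Example of what the split could look like (a rough sketch, not the actual code: the function names and the column names 'time', 'x' and 'y' are my assumptions; only 'manual_annot' and FILLNA_BEHAVIOR come from the snippet above). The first function would live in the preprocessing file, the second in draw.py:

```python
import pandas as pd

FILLNA_BEHAVIOR = "nothing"


def fill_missing_annotations(
    data: pd.DataFrame,
    column: str = "manual_annot",
    fill_value: str = FILLNA_BEHAVIOR,
) -> pd.DataFrame:
    """Return a copy of `data` with missing values in `column` replaced.

    Preprocessing step only: no plotting happens here.
    """
    cleaned = data.copy()  # leave the caller's DataFrame unchanged
    cleaned[column] = cleaned[column].fillna(fill_value)
    return cleaned


def plot_position_over_time(
    data: pd.DataFrame,
    time_col: str = "time",          # assumed column name
    position_cols: tuple = ("x", "y"),  # assumed column names
):
    """Plot the position columns against time and return the Axes.

    Plotting step only: expects data that has already been cleaned.
    """
    # DataFrame.plot draws with matplotlib, so matplotlib must be installed.
    return data.plot(x=time_col, y=list(position_cols))
```

A pipeline would then read as two explicit steps: `cleaned = fill_missing_annotations(data)` followed by `plot_position_over_time(cleaned)`.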

  • Related to this: it's not clear to me why we need to create subsets of the whole dataset (e.g. data = data[cols] and phase_data = data[data['phase'] == phase]). Can you avoid this and just select the columns you need?
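
One possible way to do this, assuming the goal is simply the rows of one phase and a few columns (the helper name `select_phase_columns` is hypothetical): a single `.loc` call selects rows and columns together, without overwriting `data` along the way.

```python
import pandas as pd


def select_phase_columns(data: pd.DataFrame, phase: str, cols: list) -> pd.DataFrame:
    """Return the rows where 'phase' equals `phase`, restricted to `cols`."""
    # One .loc call filters rows and columns; `data` itself is not reassigned.
    return data.loc[data["phase"] == phase, cols]
```

Since `data` is never reassigned, the full dataset stays available to any later step that needs other columns or phases.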

  • I would review the other functions below in terms of:

  1. docstrings, types, and descriptions of the variables and functions
  2. keeping each function short and limited to a single step, when possible
  3. avoiding hardcoded parameters