Model flow

Models are built from multiple sub-modules. The sub-modules are independent of each other and communicate through fixed interfaces. The inputs and outputs passed between modules are dictionaries.

Every model structure can be abstracted as ‘feature extraction’ + ‘task branch’, where ‘feature extraction’ consists of the backbone and the neck, and the structure of the ‘task branch’ depends on the task.
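As a rough illustration of this flow, the sketch below chains three stand-in stages that each take a dictionary and return it with extra entries. The functions backbone, neck, task_head and run_model, as well as the 'preds' key, are hypothetical placeholders and not UP APIs; only the dictionary-in, dictionary-out convention comes from the description above.

```python
def backbone(data):
    # Stand-in feature extractor: doubles every pixel value.
    data['features'] = [[2 * v for v in data['image']]]
    data['strides'] = [1]
    return data


def neck(data):
    # Stand-in neck: refines each feature level and leaves strides untouched.
    data['features'] = [[v + 1 for v in level] for level in data['features']]
    return data


def task_head(data):
    # Stand-in task branch: reduces the last feature level to one prediction.
    data['preds'] = sum(data['features'][-1])
    return data


def run_model(data, stages):
    # Stages only share the dictionary, so any of them can be swapped out.
    for stage in stages:
        data = stage(data)
    return data


output = run_model({'image': [1, 2, 3], 'filename': 'a.jpg'},
                   [backbone, neck, task_head])
print(output['preds'])     # 15
print(output['filename'])  # a.jpg (untouched entries are carried along)
```

Because every stage reads and writes the same dictionary, a different task branch can be attached without changing the feature extraction stages.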

Backbone & Neck

  1. All backbones and necks inherit from torch.nn.Module and need to implement the following interfaces.

  • __init__() When a preceding network exists, the first parameter is its number of output channels.

  • get_outplanes() When a following network exists, this function must be implemented to return the number of output channels, which is used to construct the following network.

  • forward() The input is the dictionary produced by the dataset (for a neck, the dictionary returned by the preceding module). The output format of the method is as follows.

{'features':[], 'strides': []}
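To make these three interfaces concrete, here is a minimal sketch of a neck that follows them. The class name SimpleNeck, its out_channels argument and the 1x1 convolution design are illustrative assumptions rather than part of UP; inplanes is assumed to be a list with one channel count per incoming feature map.

```python
import torch
import torch.nn as nn


class SimpleNeck(nn.Module):
    """Hypothetical neck: projects every incoming feature map to out_channels."""

    def __init__(self, inplanes, out_channels):
        # inplanes comes first: it describes the output of the preceding network.
        super().__init__()
        self.out_channels = out_channels
        self.projections = nn.ModuleList(
            [nn.Conv2d(c, out_channels, kernel_size=1) for c in inplanes]
        )

    def get_outplanes(self):
        # Lets a following network be constructed without inspecting this module.
        return [self.out_channels] * len(self.projections)

    def forward(self, input):
        features = [proj(f) for proj, f in zip(self.projections, input['features'])]
        # 1x1 convolutions keep the resolution, so strides are left as they are.
        input['features'] = features
        return input


neck = SimpleNeck(inplanes=[64, 128], out_channels=32)
dummy = {
    'features': [torch.randn(1, 64, 16, 16), torch.randn(1, 128, 8, 8)],
    'strides': [8, 16],
}
out = neck(dummy)
print([tuple(f.shape) for f in out['features']])  # [(1, 32, 16, 16), (1, 32, 8, 8)]
print(neck.get_outplanes())                       # [32, 32]
```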
  2. __init__() The parameters of this function mainly depend on the config. For example, we first use a minimal config, custom_net.yaml, to build a CNN with L layers, where L is set by depth.

Note

Ensure that the type defined in the config can be imported. In this example, we create a file ‘custom.py’ under ‘up.models.backbones’. You can use the registry to register a new module under an alias, and then refer to the module by that alias.

net:
    name: backbone
    type: custom_net
    kwargs:
        depth: 3
        out_planes: [64, 128, 256]
The module itself is defined in ‘custom.py’:

import torch
import torch.nn as nn

class CustomNet(torch.nn.Module):
    def __init__(self, depth, out_planes):
        """
        Structural parameters come from the kwargs in the config.
        This module has no inplanes parameter since it has no preceding module.
        """
        super().__init__()  # must run before any submodule is registered
        self.out_planes = out_planes

        in_planes = 3  # RGB input image
        for i in range(depth):
            self.add_module(f'layer{i}',
                            nn.Conv2d(in_planes, out_planes[i], kernel_size=3, padding=1))
            # Give each activation its own name; reusing 'relu' would keep only one ReLU.
            self.add_module(f'relu{i}', nn.ReLU(inplace=True))
            in_planes = out_planes[i]

Then we implement forward() and get_outplanes().

Note

forward() must compute the output features and strides, both of which are lists.

def forward(self, input):
    """
    The type of input (a dictionary) and the organization of its data are mainly determined by the Dataset in the config.
    Here we assume that the input contains images.
    """

    x = input['image']

    for submodule in self.children():
        x = submodule(x)

    # The output is a dictionary that must contain features and strides; all other entries of the input are kept.
    input['features'] = [x]
    # Every convolution has stride 1, so the feature map keeps the input resolution.
    input['strides'] = [1]

    return input

def get_outplanes(self):
    # One channel count per returned feature map; forward() emits only the last layer's output.
    return [self.out_planes[-1]]
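Putting the pieces together, the following sketch assembles both snippets into a single class and runs it on a random batch. It is a sanity check under a few assumptions, not UP's own test code: the class is instantiated directly instead of through the config and registry, each ReLU is registered under its own name, and get_outplanes() returns only the channel count of the single feature map that forward() emits.

```python
import torch
import torch.nn as nn


class CustomNet(nn.Module):
    def __init__(self, depth, out_planes):
        super().__init__()
        self.out_planes = out_planes
        in_planes = 3  # RGB input
        for i in range(depth):
            self.add_module(f'layer{i}',
                            nn.Conv2d(in_planes, out_planes[i], kernel_size=3, padding=1))
            self.add_module(f'relu{i}', nn.ReLU(inplace=True))
            in_planes = out_planes[i]

    def forward(self, input):
        x = input['image']
        for submodule in self.children():
            x = submodule(x)
        input['features'] = [x]
        input['strides'] = [1]
        return input

    def get_outplanes(self):
        # Assumption of this sketch: one entry per emitted feature map.
        return [self.out_planes[-1]]


net = CustomNet(depth=3, out_planes=[64, 128, 256])
batch = {'image': torch.randn(2, 3, 32, 32), 'filename': ['a.jpg', 'b.jpg']}
with torch.no_grad():
    out = net(batch)

print(out['features'][0].shape)             # torch.Size([2, 256, 32, 32])
print(out['strides'], net.get_outplanes())  # [1] [256]
print(out['filename'])                      # ['a.jpg', 'b.jpg']
```

The padding of 1 with a 3x3 kernel keeps the spatial size at 32x32, which is why a stride of 1 is reported.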

Note

The backbone can be imported in ‘__init__.py’ and will then be registered to ‘MODULE_ZOO_REGISTRY’ automatically. A neck for detection or segmentation needs to be registered to ‘MODULE_ZOO_REGISTRY’ explicitly with the decorator ‘@MODULE_ZOO_REGISTRY.register("alias")’.
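The alias mechanism can be pictured with the toy registry below. This is a simplified stand-in written for illustration, not the implementation behind MODULE_ZOO_REGISTRY; the Registry class, its build() method and the DummyNeck class are all hypothetical.

```python
class Registry(dict):
    """Toy alias -> class mapping (hypothetical, for illustration only)."""

    def register(self, alias):
        def decorator(cls):
            if alias in self:
                raise KeyError(f'alias {alias!r} is already registered')
            self[alias] = cls
            return cls  # the decorated class itself is left unchanged
        return decorator

    def build(self, alias, **kwargs):
        # Look the class up by its alias and instantiate it.
        return self[alias](**kwargs)


TOY_REGISTRY = Registry()


@TOY_REGISTRY.register('dummy_neck')
class DummyNeck:
    def __init__(self, inplanes, outplanes):
        self.inplanes = inplanes
        self.outplanes = outplanes


neck = TOY_REGISTRY.build('dummy_neck', inplanes=[256], outplanes=64)
print(type(neck).__name__, neck.outplanes)  # DummyNeck 64
```

With this pattern a config only needs to name the alias; the code that builds the model never imports the concrete class directly.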