Linux

  1. chmod
    - chmod 400 file - Read by owner
    - chmod 040 file - Read by group
    - chmod 004 file - Read by world
    - chmod 200 file - Write by owner
    - chmod 020 file - Write by group
    - chmod 002 file - Write by world
    - chmod 100 file - Execute by owner
    - chmod 010 file - Execute by group
    - chmod 001 file - Execute by world
    - chmod 444 file - Allow read permission to owner, group, and world
    - chmod 777 file - Allow everyone to read, write, and execute file
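Each octal digit is the sum of read (4), write (2), and execute (1), so the modes above can be combined: 6 = 4 + 2 means read and write. The following is a minimal sketch, not part of the original notes, that decodes a mode into the rwx string shown by `ls -l`. The function name `mode_to_rwx` is made up for illustration.

```sh
#!/bin/sh
# Decode an octal mode (for example 644) into an rwx string (rw-r--r--).
mode_to_rwx() {
  mode=$1
  out=""
  while [ -n "$mode" ]; do
    rest=${mode#?}           # everything after the first digit
    digit=${mode%"$rest"}    # the first digit on its own
    mode=$rest
    case $digit in
      0) out="${out}---" ;;
      1) out="${out}--x" ;;
      2) out="${out}-w-" ;;
      3) out="${out}-wx" ;;
      4) out="${out}r--" ;;
      5) out="${out}r-x" ;;
      6) out="${out}rw-" ;;
      7) out="${out}rwx" ;;
      *) echo "not an octal digit: $digit" >&2; return 1 ;;
    esac
  done
  printf '%s\n' "$out"
}

mode_to_rwx 400
mode_to_rwx 644
mode_to_rwx 777
```

The three digits map to owner, group, and world in that order, which is why `chmod 400` only affects the owner.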

ARCH Linux Setup

  1. Setup WIFI

    iwctl
    station wlan0 get-networks
    station wlan0 connect <Network name>
  2. Arch Install

    • GUI
      archinstall

    or

    • CLI

      Create partition

      cfdisk /dev/nvme0n1
      # 800 M for EFI System
      # > 20 GB for Linux filesystem
      # ... for Linux swap

      Format

      mkfs.fat -F32 /dev/<EFI System>
      mkfs.ext4 /dev/<Linux filesystem>
      mkswap /dev/<swap>

      Mount

      # root
      mount /dev/<Linux filesystem> /mnt
      mkdir /mnt/boot
      mount /dev/<EFI System> /mnt/boot
      swapon /dev/<swap>

      Install

      pacstrap -i /mnt base base-devel linux-zen linux-firmware git sudo neofetch htop intel-ucode nano vim bluez bluez-utils networkmanager
      genfstab -U /mnt >> /mnt/etc/fstab
      cat /mnt/etc/fstab

      Enter the system

      arch-chroot /mnt
      # change root password
      passwd
      # create user
      useradd -m -g users -G wheel,storage,power,video,audio -s /bin/bash <username>
      passwd <username>
      EDITOR=vim visudo
      # uncomment line %wheel ALL=(ALL:ALL) ALL

      Timezone

      ln -sf /usr/share/zoneinfo/... /etc/localtime
      hwclock --systohc
      vim /etc/locale.gen
      # uncomment en_US ...
      locale-gen
      vim /etc/locale.conf
      # add "LANG=en_US.UTF-8"

      Hostname

      vim /etc/hostname
      # add hostname
      vim /etc/hosts
      # add these lines:
      127.0.0.1 localhost
      ::1 localhost
      127.0.1.1 <hostname>.localdomain <hostname>

      Bootloader

      pacman -S grub efibootmgr dosfstools mtools
      grub-install --target=x86_64-efi --efi-directory=/boot --bootloader-id=GRUB
      grub-mkconfig -o /boot/grub/grub.cfg

      Finish

      systemctl enable bluetooth
      systemctl enable NetworkManager
      exit
      umount -lR /mnt
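The format and mount steps in the CLI path use placeholders such as `/dev/<EFI System>`. As a rough sketch, the partition names can be derived from the disk name: disks whose name ends in a digit (NVMe, for instance) take a `p` before the partition number. The helper `part_name` and the partition order (1 = EFI, 2 = root, 3 = swap) are assumptions for illustration; check the real layout with `lsblk` first. The script only prints the commands so they can be reviewed.

```sh
#!/bin/sh
# Build a partition device name from a disk and a partition number.
part_name() {
  case $1 in
    *[0-9]) printf '%sp%s\n' "$1" "$2" ;;   # /dev/nvme0n1 + 1 -> /dev/nvme0n1p1
    *)      printf '%s%s\n'  "$1" "$2" ;;   # /dev/sda + 1     -> /dev/sda1
  esac
}

disk=/dev/nvme0n1
# Assumed layout (hypothetical): 1 = EFI System, 2 = Linux filesystem, 3 = swap.
efi=$(part_name "$disk" 1)
root=$(part_name "$disk" 2)
swap=$(part_name "$disk" 3)

# Print the commands for review instead of running them.
echo "mkfs.fat -F32 $efi"
echo "mkfs.ext4 $root"
echo "mkswap $swap"
echo "mount $root /mnt"
echo "mkdir /mnt/boot"
echo "mount $efi /mnt/boot"
echo "swapon $swap"
```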

Unplug the USB drive and boot to the system

  1. Setup

    Enable radio wifi

    nmcli dev status
    nmcli radio wifi on
    nmcli dev wifi list
    sudo nmcli dev wifi connect <name> password "<password>"
    # update
    sudo pacman -Syu

    Install Desktop GUI

    sudo pacman -S xorg sddm plasma-meta plasma-workspace kde-applications
    sudo systemctl enable sddm
    sudo systemctl start sddm
  2. Fix Backend

    Fix Discover App

    sudo pacman -S flatpak

    Install Nvidia Driver

    lspci | grep -E "NVIDIA"
    sudo pacman -S nvidia
  3. Edit boot loader

    sudo pacman -S os-prober
    sudo vim /etc/default/grub
    # change following line
    # GRUB_TIMEOUT=20
    # uncomment GRUB_DISABLE_OS_PROBER=false
    sudo grub-mkconfig -o /boot/grub/grub.cfg
  4. Chinese Character and Keyboard

    sudo pacman -S noto-fonts noto-fonts-cjk noto-fonts-extra noto-fonts-emoji ttf-dejavu ttf-liberation
    sudo pacman -S fcitx5-im fcitx5-rime
    cd ~/.local/share/fcitx5/rime
    git clone https://github.com/iDvel/rime-ice.git
    cp -r ./rime-ice/* .
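The two edits to /etc/default/grub in the "Edit boot loader" step can also be made without opening an editor. This is a small sketch using `sed` as a filter; the function name `patch_grub_defaults` is invented here, and the demo runs on sample text rather than the real file. It assumes the stock file contains a `GRUB_TIMEOUT=` line and a commented `#GRUB_DISABLE_OS_PROBER=false` line.

```sh
#!/bin/sh
# Read GRUB defaults on stdin, write the patched text to stdout.
patch_grub_defaults() {
  sed -e 's/^GRUB_TIMEOUT=.*/GRUB_TIMEOUT=20/' \
      -e 's/^#[[:space:]]*GRUB_DISABLE_OS_PROBER=false/GRUB_DISABLE_OS_PROBER=false/'
}

# Demo on a two-line sample instead of the real /etc/default/grub.
printf '%s\n' 'GRUB_TIMEOUT=5' '#GRUB_DISABLE_OS_PROBER=false' | patch_grub_defaults
```

To apply it for real, one option is to write the output to a temporary file, compare it with the original, and copy it into place with sudo before running grub-mkconfig.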

SSH Configuration

  1. RSA

    • RSA keys have been the default for many years and are supported by almost all SSH clients and servers. They are well-understood and trusted in various computing environments. Many systems default to RSA key lengths of 2048 or 3072 bits, though some users prefer 4096 bits for enhanced security.
      ssh-keygen -t rsa -b 4096 -C "your_email@example.com"
  2. Ed25519

    • Ed25519 is increasingly popular due to its strong security features and efficiency. It uses elliptic curve cryptography to provide excellent security with shorter keys, resulting in faster performance and less data usage during authentication. Many modern systems and security guidelines now recommend Ed25519 as the preferred choice for new key generation.
      ssh-keygen -t ed25519 -C "your_email@example.com"
  3. ECDSA

    • ECDSA is another commonly used type, particularly because it also offers good security with shorter key lengths compared to RSA. It's often used where there's a need for a balance between compatibility and modern cryptographic practices. ECDSA keys using the NIST P-256 curve (nistp256) are particularly common.
      ssh-keygen -t ecdsa -b 256 -C "your_email@example.com"
  • note: RSA and Ed25519 are generally the most recommended, with Ed25519 often preferred for new deployments due to its robustness and efficiency. RSA remains widely used due to its long history and broad support across older and legacy systems. For new systems or updates, transitioning to Ed25519 from RSA or ECDSA is a common recommendation for enhanced security and performance.
  1. Server Config

    • Copy and paste the public key into the authorized_keys file on the server.
      echo "paste-your-public-key-here" >> ~/.ssh/authorized_keys
      chmod 600 ~/.ssh/authorized_keys
  2. Local Config

    • Create a config file in the .ssh folder
      Host "custom name"
        HostName "hostname -after @"
        User "username"
        IdentityFile "private key location"
    • After configuration, use the following command to connect to the server
      ssh "custom name"
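As an illustration of the local config step, the sketch below prints one `Host` block from four values. The function name `ssh_host_block`, the alias `myserver`, the address `192.0.2.10`, the user `sysadmin`, and the key path are all placeholder assumptions, not values from these notes.

```sh
#!/bin/sh
# Print an ssh_config Host block: alias, host name, user, private key path.
ssh_host_block() {
  printf 'Host %s\n' "$1"
  printf '    HostName %s\n' "$2"
  printf '    User %s\n' "$3"
  printf '    IdentityFile %s\n' "$4"
}

# Hypothetical values; review the output before appending it to ~/.ssh/config.
ssh_host_block myserver 192.0.2.10 sysadmin '~/.ssh/id_ed25519'
```

With such a block in ~/.ssh/config, `ssh myserver` is equivalent to passing the user, host, and key on the command line.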

Install Ansible

  1. Initializing
    • Create a folder inventory with a hosts file in it
      [server] # group name
      {ip address}
      {server name}
    • Try to ping the servers with password
      ansible -i ./inventory/hosts server -m ping --user sysadmin --ask-pass
    • Create a folder playbooks containing a YAML file apt.yml
      - hosts: "*"
        become: true
        tasks:
          - name: apt
            apt:
              update_cache: yes
              upgrade: 'yes'
    • Run the playbook
      ansible-playbook ./playbooks/apt.yml --user serveradmin --ask-pass --ask-become-pass -i ./inventory/hosts
    • Create a file qemu-guest-agent.yml under playbooks to install a package
      - name: install latest qemu-guest-agent
        hosts: "*"
        tasks:
          - name: install qemu-guest-agent
            apt:
              name: qemu-guest-agent
              state: present
              update_cache: true
            become: true
    • Add the mattermost playbook
      ---
      - name: Install Mattermost Server
        hosts: all
        become: yes
        vars:
          mattermost_version: 5.31.0
          mattermost_db_name: mattermost
          mattermost_db_user: mmuser
          mattermost_db_password: mmuser_password
        tasks:
          - name: Install necessary packages
            apt:
              name: "{{ item }}"
              state: present
            with_items:
              - git
              - nginx
              - postgresql
              - postgresql-contrib

          - name: Create Mattermost user
            user:
              name: mattermost
              state: present

          - name: Clone Mattermost server
            git:
              repo: 'https://github.com/mattermost/mattermost-server.git'
              dest: "/opt/mattermost-server"
              version: "v{{ mattermost_version }}"
            become: yes
            become_user: mattermost

          - name: Configure PostgreSQL
            block:
              - name: Create Mattermost database
                postgresql_db:
                  name: "{{ mattermost_db_name }}"
                  login_user: postgres

              - name: Create Mattermost database user
                postgresql_user:
                  db: "{{ mattermost_db_name }}"
                  name: "{{ mattermost_db_user }}"
                  password: "{{ mattermost_db_password }}"
                  priv: ALL
                  login_user: postgres

          - name: Set up Mattermost configuration
            template:
              src: mattermost_config.json.j2
              dest: "/opt/mattermost-server/config/config.json"
              owner: mattermost
              mode: '0644'

          - name: Start Mattermost service
            systemd:
              name: mattermost
              state: started
              enabled: yes

Virtual Machine with Vagrant

  1. Download and Install Tools

  2. Get the Linux Box from Vagrant Cloud

    • Visit Vagrant Cloud to find a suitable Linux box. Alternatively, you can add a Linux box directly using the command line:
      vagrant box add [box_name]
      Replace [box_name] with the name of the Linux box you want to use.
  3. Initialize Vagrant Environment

    • Initialize the VM with the following command:
      vagrant init [box_name]
      Again, replace [box_name] with the name of your chosen box.
  4. Start the Virtual Machine

    • Start the VM with:
      vagrant up
  5. Check Installed Linux Box Version

    • To check the installed Linux version and other boxes, use:
      vagrant box list
  6. Connect to VM

    • Connect to your VM via SSH using:
      vagrant ssh
  7. Disconnect from VM

    • suspend the VM
      vagrant suspend
    • resume from suspend
      vagrant resume
    • shutdown the VM
      vagrant halt

Add on features for Linux app

  1. NeoVim Setup

    Requirements:

    • Install Nerd font first
      wget https://github.com/ryanoasis/nerd-fonts/releases/download/v3.2.1/Hack.zip
      unzip Hack.zip -d Hack
      mkdir -p ~/.local/share/fonts
      cp Hack/*.ttf ~/.local/share/fonts/
      fc-cache -fv
    • Install npm
      sudo apt install npm
    git clone https://github.com/Henryfzh/documentation.git
    `gcc` - Toggles the current line using linewise comment
    `gbc` - Toggles the current line using blockwise comment
    `[count]gcc` - Toggles the number of line given as a prefix-count using linewise
    `[count]gbc` - Toggles the number of line given as a prefix-count using blockwise
    `gc[count]{motion}` - (Op-pending) Toggles the region using linewise comment
    `gb[count]{motion}` - (Op-pending) Toggles the region using blockwise comment
  2. Theme

    Blur the windows:

    mutter-rounded mutter-rounded setting
  3. mdBook

    Requirements:

    • Install Rust
      cargo install mdbook
  4. tmux

    Install TPM:

    • Clone:
      git clone https://github.com/tmux-plugins/tpm ~/.tmux/plugins/tpm
    • Create ~/.tmux.conf, and add following to it:
      # List of plugins
      set -g @plugin 'tmux-plugins/tpm'
      set -g @plugin 'tmux-plugins/tmux-sensible'
      set -g @plugin 'catppuccin/tmux'
      set -g @catppuccin_flavour 'mocha'
      run '~/.tmux/plugins/tpm/tpm'
      set -g default-terminal 'tmux-256color'
      set -g mouse on

      # 2. Copy-mode feels like Vim
      setw -g mode-keys vi # h-j-k-l in copy-mode
      bind -T copy-mode-vi v send -X begin-selection
      bind -T copy-mode-vi y send -X copy-selection-and-cancel

      # 3. Quick pane movement with Ctrl-h/j/k/l (same as Vim splits)
      bind -n C-h select-pane -L
      bind -n C-j select-pane -D
      bind -n C-k select-pane -U
      bind -n C-l select-pane -R

      # 4. Resize panes with Alt + arrows
      bind -n M-Left resize-pane -L 5
      bind -n M-Right resize-pane -R 5
      bind -n M-Up resize-pane -U 2
      bind -n M-Down resize-pane -D 2

      # 5. Fast config reload: <prefix> r
      bind r source-file ~/.tmux.conf \; display-message "✔ tmux.conf reloaded"

      # 6. Shorter escape delay (makes Vim feel snappier inside tmux)
      set -s escape-time 0
    • Install
      ctrl + B, I
    • Reload:
      tmux source ~/.tmux.conf
  5. zsh fuzzy finder

    fzf

    .zshrc plugins

    powerlevel10k
    copypath copyfile copybuffer

    To use zsh in the Flatpak build of VSCode, add these lines to settings.json:

    "terminal.integrated.defaultProfile.linux": "bash",
    "terminal.integrated.profiles.linux": {
      "bash": {
        "path": "/usr/bin/flatpak-spawn",
        "overrideName": true,
        "args": ["--host", "--env=TERM=xterm-256color", "zsh"]
      }
    },

Docker Basics

  1. Build

    sudo docker build -t <target-name> -f Dockerfile .
  2. Run

    • With bash
      sudo docker run --rm -it --device /dev/kfd --device /dev/dri --security-opt seccomp=unconfined <target-name> /bin/bash
    • Without bash
      sudo docker run --rm --device /dev/kfd --device /dev/dri --security-opt seccomp=unconfined rochpl.6.0 mpirun_rochpl -P 1 -Q 1 -N 45312
    • Mount a Directory
      sudo docker run --rm -it --device /dev/kfd --device /dev/dri --security-opt seccomp=unconfined --network host --name rochpl_node -v <directory>:/opt rochpl /usr/sbin/sshd -D
  3. Update Docker

    • commit the changes
      docker ps # to get the container id
      docker commit <containerID> <imageid>
  4. Clean

    # remove all images
    sudo docker system prune -a
    sudo docker container prune
    sudo docker buildx prune -f
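The `docker run` variants in the Run step repeat the same device and security flags. One way to keep them in a single place is sketched below; `COMMON_FLAGS` and `docker_run_cmd` are names invented for this example, and the function only prints the command line so it can be checked before it is executed.

```sh
#!/bin/sh
# Flags shared by the docker run commands in this section.
COMMON_FLAGS="--rm --device /dev/kfd --device /dev/dri --security-opt seccomp=unconfined"

# Print a docker run command: first argument is the image, the rest is the
# command to run inside the container.
docker_run_cmd() {
  image=$1
  shift
  echo "sudo docker run $COMMON_FLAGS $image $*"
}

docker_run_cmd rochpl /bin/bash
```

Extra options such as `-it` or `-v <directory>:/opt` still have to be added per use case; the sketch covers only the shared part.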

High Performance Linpack

  1. Install OpenMPI
    • Download and unzip
      wget https://download.open-mpi.org/release/open-mpi/v5.0/openmpi-5.0.6.tar.gz
      tar xzf openmpi-5.0.6.tar.gz
      cd openmpi-5.0.6/
    • Compile
      Preferably install to "/usr/local/", but this requires sudo access
      ./configure --prefix=<OPENMPI_INSTALL_DIRECTORY>
      make
      make install
  2. Install OpenBLAS
    • Download and unzip
      wget https://github.com/OpenMathLib/OpenBLAS/releases/download/v0.3.28/OpenBLAS-0.3.28.tar.gz
      tar xzf OpenBLAS-0.3.28.tar.gz
      cd OpenBLAS-0.3.28/
    • Compile
      Preferably install to "/usr/local/", but this requires sudo access
      make
      make PREFIX=<OPEN_BLAS_INSTALL_DIRECTORY> install
  3. Update Path
    • Update path to OpenMPI and OpenBLAS in .bashrc or .zshrc
      export PATH=<OPENMPI_INSTALL_DIRECTORY>/bin:$PATH
      export LD_LIBRARY_PATH=<OPENMPI_INSTALL_DIRECTORY>/lib:$LD_LIBRARY_PATH
      export LD_LIBRARY_PATH=<OPEN_BLAS_INSTALL_DIRECTORY>/lib:$LD_LIBRARY_PATH
      source ~/.bashrc
      or
      source ~/.zshrc
  4. Download HPL
    • Using wget or curl download from official website:
      wget http://www.netlib.org/benchmark/hpl/hpl-2.3.tar.gz
      or
      curl -O http://www.netlib.org/benchmark/hpl/hpl-2.3.tar.gz
    • Unzip
      tar -xf hpl-2.3.tar.gz

    1. Compile with CPU

    2. Compile with AMD GPU
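Because .bashrc or .zshrc is sourced again in every new shell, the exports from the "Update Path" step can add the same directory repeatedly. A possible guard is sketched here; `prepend_path` is a hypothetical helper, and `/opt/ompi/bin` stands in for the real install directory.

```sh
#!/bin/sh
# Prepend directory $1 to the colon-separated list $2 unless it is already there.
prepend_path() {
  case ":$2:" in
    *":$1:"*) printf '%s\n' "$2" ;;             # already present, keep as is
    *)
      if [ -n "$2" ]; then
        printf '%s:%s\n' "$1" "$2"
      else
        printf '%s\n' "$1"                      # list was empty
      fi
      ;;
  esac
}

# Demo on a sample value rather than the live PATH.
demo="/usr/bin:/bin"
demo=$(prepend_path /opt/ompi/bin "$demo")
demo=$(prepend_path /opt/ompi/bin "$demo")      # second call changes nothing
echo "$demo"
```

In a shell startup file the same call would look like `PATH=$(prepend_path "<OPENMPI_INSTALL_DIRECTORY>/bin" "$PATH")`, and likewise for LD_LIBRARY_PATH.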

High Performance Linpack with CPU

  1. Compile HPL
    • Copy the template Makefile:
      cp setup/Make.Linux_Intel64 Make.Linux_Intel64
    • Edit the Makefile and change the following lines:
      TOPdir       = <hpl-2.3 top folder directory>
      MPdir        = <openmpi file directory>
      MPinc        = -I$(MPdir)/include
      MPlib        = -L$(MPdir)/lib -lmpi
      LAdir        = <openblas file directory>
      LAinc        = -I$(LAdir)/include
      LAlib        = $(LAdir)/lib/libopenblas.a
      CC           = mpicc
      CCNOOPT      = $(HPL_DEFS)
      CCFLAGS      = $(HPL_DEFS) -O3 -w -z noexecstack -z relro -z now -Wall # modify this according to the cpu
      LINKFLAGS    = $(CCFLAGS) $(OMP_DEFS)
    • Compile
      make arch=Linux_Intel64
    • If you want to clean:
      make clean arch=Linux_Intel64
  2. Run HPL
    • Edit the file bin/Linux_Intel64/HPL.dat inside the top folder.
      Here is an example for 8 GB of RAM and a 4-core CPU:

      HPLinpack benchmark input file
      Innovative Computing Laboratory, University of Tennessee
      HPL.out      output file name (if any)
      6            device out (6=stdout,7=stderr,file)
      1            # of problems sizes (N)
      29184        Ns
      1            # of NBs
      192          NBs
      0            PMAP process mapping (0=Row-,1=Column-major)
      1            # of process grids (P x Q)
      2            Ps
      2            Qs
      16.0         threshold
      1            # of panel fact
      2            PFACTs (0=left, 1=Crout, 2=Right)
      1            # of recursive stopping criterium
      4            NBMINs (>= 1)
      1            # of panels in recursion
      2            NDIVs
      1            # of recursive panel fact.
      1            RFACTs (0=left, 1=Crout, 2=Right)
      1            # of broadcast
      1            BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
      1            # of lookahead depth
      1            DEPTHs (>=0)
      2            SWAP (0=bin-exch,1=long,2=mix)
      64           swapping threshold
      0            L1 in (0=transposed,1=no-transposed) form
      0            U  in (0=transposed,1=no-transposed) form
      1            Equilibration (0=no,1=yes)
      8            memory alignment in double (> 0)
      ##### This line (no. 32) is ignored (it serves as a separator). ######
      0                               Number of additional problem sizes for PTRANS
      1200 10000 30000                values of N
      0                               number of additional blocking sizes for PTRANS
      40 9 8 13 13 20 16 32 64        values of NB
    • To tune the parameters, refer to the website here. Its suggested values are not guaranteed to be optimal, so try tuning the parameters yourself.

      The parameters you will most likely need to tune are:

      • Ps * Qs: the number of cores
      • Ns: the problem size
      • NBs: the block size
    • Run benchmark

      mpirun -np <number of cores> ./xhpl
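A commonly used rule of thumb, not stated in these notes, is to size the N x N matrix of 8-byte doubles to fill roughly 80% of memory and to round N down to a multiple of NB. The sketch below applies that rule and also picks a near-square process grid with P <= Q. The function names and the 0.8 default are assumptions; with 8 GiB and NB = 192 the rule happens to give the Ns value used in the example HPL.dat above.

```sh
#!/bin/sh
# Problem size N for a given memory size.
# $1 = memory in GiB, $2 = block size NB, $3 = fraction of memory (default 0.8)
hpl_problem_size() {
  awk -v mem="$1" -v nb="$2" -v frac="${3:-0.8}" 'BEGIN {
    n = int(sqrt(mem * 1024 * 1024 * 1024 * frac / 8))  # 8 bytes per double
    print int(n / nb) * nb                              # round down to a multiple of NB
  }'
}

# Process grid "P Q" for $1 processes: the largest P with P*P <= np that divides np.
hpl_grid() {
  np=$1
  p=1
  i=1
  while [ $((i * i)) -le "$np" ]; do
    if [ $((np % i)) -eq 0 ]; then p=$i; fi
    i=$((i + 1))
  done
  echo "$p $((np / p))"
}

hpl_problem_size 8 192
hpl_grid 4
```

Treat the result as a starting point: the operating system and other processes also need memory, so a smaller fraction may be required on a busy machine.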

High Performance Linpack with AMD GPU

Prepare (Download the Dockerfile)

Dockerfile for AMD GPU

1. Build Dockerfile

sudo docker build -t rochpl -f Dockerfile .

2. Setup Docker image

  • Node A
    docker save -o rochpl_image.tar rochpl
    scp rochpl_image.tar user@10.0.0.12:~
  • Node B
    docker load -i ~/rochpl_image.tar
  • Both nodes
    sudo docker run --rm -it \
      --device /dev/kfd \
      --device /dev/dri \
      --security-opt seccomp=unconfined \
      --network=host \
      --name=rochpl_node \
      rochpl /bin/bash
  • Setup SSH keys
    # Both Nodes
    ssh-keygen -t rsa -f ~/.ssh/id_rsa -q -N ""
    # Both Nodes
    vim /etc/ssh/sshd_config
    # change the line --- PasswordAuthentication yes
    # add this line --- PermitRootLogin yes
    # Node A
    ssh-copy-id -p 2222 root@10.0.0.12
    # Node B
    ssh-copy-id -p 2222 root@10.0.0.14
  • Add following to both nodes
    vim ~/.ssh/config
    Host 10.0.0.14
      Port 2222
      User root
    Host 10.0.0.12
      Port 2222
      User root
    • Test if it works
      ssh 10.0.0.14 hostname

3. Run HPL

  • Add the rochpl_hostfile on both nodes
    10.0.0.14 slots=4
    10.0.0.12 slots=4
  • Run HPL using this command (modify the arguments to suit your environment)
    export OMPI_MCA_pmix=pmix
    mpirun --hostfile rochpl_hostfile -np 8 --bind-to none -x HIP_VISIBLE_DEVICES=0,1,2,3 --mca pml ucx --mca btl ^vader,tcp,openib,uct ./run_rochpl -P 2 -Q 4 -N 256000 --NB 512
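The `-np 8` in the command above equals the sum of the `slots=` values in rochpl_hostfile, and `-P 2 -Q 4` multiplies to the same number. A small sketch for checking that total is shown below; `total_slots` is a made-up helper that reads hostfile lines from standard input.

```sh
#!/bin/sh
# Sum the slots=N entries of an Open MPI hostfile read from stdin.
total_slots() {
  awk '{
    for (i = 2; i <= NF; i++)
      if ($i ~ /^slots=/) { sub(/^slots=/, "", $i); total += $i }
  } END { print total + 0 }'
}

# Demo with the two hostfile lines from this section.
printf '%s\n' '10.0.0.14 slots=4' '10.0.0.12 slots=4' | total_slots
```

On a node, `total_slots < rochpl_hostfile` gives the value to pass to `-np`.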

Dockerfile for AMD GPU

ARG UBUNTU_VERSION="jammy"

FROM ubuntu:${UBUNTU_VERSION}

ARG ROCM_URL="https://repo.radeon.com/amdgpu-install/6.1.1/ubuntu/jammy/amdgpu-install_6.1.60101-1_all.deb"
ARG UCX_BRANCH="v1.16.0"
ARG UCC_BRANCH="v1.3.0"
ARG OMPI_BRANCH="v5.0.3"
ARG APT_GET_APPS=""
ARG GPU_TARGET="gfx908,gfx90a,gfx942"

# Update and Install basic Linux development tools
RUN apt-get update \
    && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
    ca-certificates \
    git \
    ssh \
    openssh-client \
    openssh-server \
    make \
    vim \
    nano \
    libtinfo-dev \
    initramfs-tools \
    libelf-dev \
    numactl \
    curl \
    wget \
    tmux \
    build-essential \
    autoconf \
    automake \
    libtool \
    pkg-config \
    libnuma-dev \
    gfortran \
    flex \
    hwloc \
    libstdc++-12-dev \
    libxml2-dev \
    python3-dev \
    python3-pip \
    python3-distutils \
    unzip ${APT_GET_APPS} \
    && apt-get clean

RUN wget -qO- https://repo.radeon.com/rocm/rocm.gpg.key | gpg --dearmor | tee /etc/apt/trusted.gpg.d/rocm.gpg \
    && wget -O rocm.deb ${ROCM_URL} \
    && apt install -y ./rocm.deb \
    && amdgpu-install --usecase=rocm,hiplibsdk --no-dkms -y

RUN bash -c """IFS=',' read -r -a ARCH <<<${GPU_TARGET} \
    && for gpu_arch in \${ARCH[@]}; do \
        echo \$gpu_arch >> /opt/rocm/bin/target.lst; \
    done""" \
    && chmod a+r /opt/rocm/bin/target.lst

#
# Requires cmake > 3.22
RUN mkdir -p /opt/cmake \
    && wget --no-check-certificate --quiet -O - https://cmake.org/files/v3.27/cmake-3.27.7-linux-x86_64.tar.gz | tar --strip-components=1 -xz -C /opt/cmake

ENV ROCM_PATH=/opt/rocm \
    UCX_PATH=/opt/ucx \
    UCC_PATH=/opt/ucc \
    OMPI_PATH=/opt/ompi \
    GPU_TARGET=${GPU_TARGET}

# Adding rocm/cmake to the Environment
ENV PATH=$ROCM_PATH/bin:/opt/cmake/bin:$PATH \
    LD_LIBRARY_PATH=$ROCM_PATH/lib:$ROCM_PATH/lib64:$ROCM_PATH/llvm/lib:$LD_LIBRARY_PATH \
    LIBRARY_PATH=$ROCM_PATH/lib:$ROCM_PATH/lib64:$LIBRARY_PATH \
    C_INCLUDE_PATH=$ROCM_PATH/include:$C_INCLUDE_PATH \
    CPLUS_INCLUDE_PATH=$ROCM_PATH/include:$CPLUS_INCLUDE_PATH \
    CMAKE_PREFIX_PATH=$ROCM_PATH/lib/cmake:$CMAKE_PREFIX_PATH

# Create the necessary directory for SSH
RUN mkdir /var/run/sshd

# Set root password for login
RUN echo 'root:redhat' | chpasswd

# Allow root login and password authentication
RUN sed -i 's/#PasswordAuthentication no/PasswordAuthentication yes/' /etc/ssh/sshd_config && \
    sed -i 's/PermitRootLogin prohibit-password/PermitRootLogin yes/' /etc/ssh/sshd_config && \
    echo "StrictModes no" >> /etc/ssh/sshd_config

# Change the SSH port to 2222
RUN sed -i 's/#Port 22/Port 2222/' /etc/ssh/sshd_config

# Expose the new SSH port
EXPOSE 2222

# Start the SSH service and keep the container running
ENTRYPOINT service ssh restart && bash

WORKDIR /tmp

# Install UCX
RUN git clone https://github.com/openucx/ucx.git -b ${UCX_BRANCH} \
    && cd ucx \
    && ./autogen.sh \
    && mkdir build \
    && cd build \
    && ../contrib/configure-release --prefix=$UCX_PATH \
        --with-rocm=$ROCM_PATH \
        --without-knem \
        --without-xpmem \
        --without-cuda \
        --enable-optimizations \
        --disable-logging \
        --disable-debug \
        --disable-examples \
    && make -j $(nproc) \
    && make install

# Install UCC
RUN git clone -b ${UCC_BRANCH} https://github.com/openucx/ucc \
    && cd ucc \
    && ./autogen.sh \
    && sed -i 's/memoryType/type/g' ./src/components/mc/rocm/mc_rocm.c \
    # offload-arch=native builds the local architecture, which may not be present at build time for a container.
    && sed -i 's/--offload-arch=native//g' ./cuda_lt.sh \
    && mkdir build \
    && cd build \
    && ../configure --prefix=${UCC_PATH} --with-rocm=${ROCM_PATH} --with-ucx=${UCX_PATH} --with-rccl=no \
    && make -j $(nproc) \
    && make install

# Install OpenMPI
RUN git clone --recursive https://github.com/open-mpi/ompi.git -b ${OMPI_BRANCH} \
    && cd ompi \
    && ./autogen.pl \
    && mkdir build \
    && cd build \
    && ../configure --prefix=$OMPI_PATH --with-ucx=$UCX_PATH \
        --with-ucc=${UCC_PATH} \
        --enable-mca-no-build=btl-uct \
        --without-verbs \
        --with-pmix=internal \
        --enable-mpi \
        --enable-mpi-fortran=yes \
        --disable-man-pages \
        --disable-debug \
    && make -j $(nproc) \
    && make install

# Adding OpenMPI, UCX, and UCC to Environment
ENV PATH=$OMPI_PATH/bin:$UCX_PATH/bin:$UCC_PATH/bin:$PATH \
    LD_LIBRARY_PATH=$OMPI_PATH/lib:$UCX_PATH/lib:$UCC_PATH/lib:$LD_LIBRARY_PATH \
    LIBRARY_PATH=$OMPI_PATH/lib:$UCX_PATH/lib:$UCC_PATH/lib:$LIBRARY_PATH \
    C_INCLUDE_PATH=$OMPI_PATH/include:$UCX_PATH/include:$UCC_PATH/include:$C_INCLUDE_PATH \
    CPLUS_INCLUDE_PATH=$OMPI_PATH/include:$UCX_PATH/include:$UCC_PATH/include:$CPLUS_INCLUDE_PATH \
    PKG_CONFIG_PATH=$OMPI_PATH/lib/pkgconfig:$UCX_PATH/lib/pkgconfig/:$PKG_CONFIG_PATH \
    OMPI_ALLOW_RUN_AS_ROOT=1 \
    OMPI_ALLOW_RUN_AS_ROOT_CONFIRM=1 \
    UCX_WARN_UNUSED_ENV_VARS=n

# Install Additional Apps Below
ARG HPL_BRANCH="main"
WORKDIR /opt

# Installing rocHPL
RUN git clone -b ${HPL_BRANCH} https://github.com/ROCmSoftwarePlatform/rocHPL.git \
    && cd rocHPL \
    && ./install.sh \
        --prefix=/opt/rochpl \
        --with-rocm=/opt/rocm/ \
        --with-mpi=/opt/ompi \
    && rm -rf /tmp/rocHPL

ENV PATH=$PATH:/opt/rochpl:/opt/rochpl/bin
ENV HIP_VISIBLE_DEVICES=0,1,2,3

#CMD ["/usr/sbin/sshd", "-D"]
CMD ["/bin/bash"]

Machine Learning

Computer Vision

  1. VGG16

    • VGG16 has about 138 million parameters. All convolution kernels are 3x3, and all max-pooling kernels are 2x2 with a stride of two.
      Figure: VGG16 Architecture
  2. ResNet

    • ResNet-18 has around 11 million trainable parameters. It consists of CONV layers with 3x3 filters (just like VGGNet). Only two pooling layers are used in the whole network, one at the beginning and one at the end. Identity connections span every two CONV layers. In the figure, solid arrows show identity shortcuts, where the input and output dimensions are the same, and dotted arrows show projection connections, where the dimensions differ.
      Figure: ResNet Architecture
  3. Architecture Differences:

    • VGG16: VGG16 is a deep convolutional network with a straightforward and uniform architecture: 16 weight layers (13 convolutional and 3 fully connected) with very small (3x3) convolution filters. It is known for its simplicity and has been a popular choice for image classification tasks.
    • ResNet: ResNet, particularly ResNet-50, uses residual connections that help mitigate the vanishing gradient problem, allowing for the training of much deeper networks. ResNet architectures are typically deeper and more complex than VGG16, which generally results in better feature extraction and higher accuracy in many tasks.
  4. Performance:

    • Accuracy: ResNet models, due to their depth and residual connections, generally outperform VGG16 in many image recognition tasks, including object detection. They are able to learn more complex features and provide better accuracy.
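    • Computation and Memory: Cost depends on the variant. VGG16 has about 138 million parameters, most of them in its fully connected layers, so it is larger and needs more floating-point operations per forward pass than ResNet-18 or ResNet-50. Deeper ResNets such as ResNet-101 add compute and latency, which can be a consideration if you have limited computational resources.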
  5. Application in Object Detection:

    • Object detection frameworks such as Faster R-CNN, SSD, and YOLO have utilized both VGG and ResNet as backbone feature extractors. In many cases, ResNet-based models have shown better performance in terms of both precision and recall.
    • For instance, Faster R-CNN with a ResNet-50 or ResNet-101 backbone generally performs better than the same framework with a VGG16 backbone.
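The 138 million parameter figure quoted for VGG16 can be reproduced with plain integer arithmetic. The sketch below assumes the standard configuration, which the notes do not spell out: 13 convolution layers with 3x3 kernels, a 224x224 input (so the last pooling layer outputs 512 x 7 x 7 = 25088 values), and three fully connected layers ending in 1000 classes. The function names are invented for the example.

```sh
#!/bin/sh
# Parameters of a 3x3 convolution: weights plus one bias per output channel.
conv_params() { echo $(( 3 * 3 * $1 * $2 + $2 )); }
# Parameters of a fully connected layer: weights plus one bias per output.
fc_params()   { echo $(( $1 * $2 + $2 )); }

vgg16_params() {
  total=0
  # (input channels, output channels) of the 13 convolution layers
  set -- 3 64  64 64  64 128  128 128  128 256  256 256  256 256 \
         256 512  512 512  512 512  512 512  512 512  512 512
  while [ $# -gt 0 ]; do
    total=$(( total + $(conv_params "$1" "$2") ))
    shift 2
  done
  # (inputs, outputs) of the three fully connected layers
  set -- 25088 4096  4096 4096  4096 1000
  while [ $# -gt 0 ]; do
    total=$(( total + $(fc_params "$1" "$2") ))
    shift 2
  done
  echo "$total"
}

vgg16_params
```

Under these assumptions the first fully connected layer alone accounts for more than 100 million parameters, which is why VGG16 is so much larger than ResNet-18.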

Practical Considerations:

  • ResNet Advantages:

    • Better accuracy and feature representation due to deeper network architecture.
    • Residual connections help in training deeper networks, resulting in improved performance.
  • VGG16 Advantages:

    • Simpler architecture which can be easier to implement and train.
    • Shallower network (16 weight layers), although not cheaper to run: its roughly 138 million parameters far exceed ResNet-18's roughly 11 million.

Conclusion:

In general, ResNet models tend to be better than VGG16 for object detection tasks due to their superior feature extraction capabilities and higher accuracy. Deeper ResNet variants cost more compute than shallower ones, so the choice of depth is the main trade-off.

If computational resources are not a constraint, it is recommended to use a deeper ResNet (e.g., ResNet-50 or ResNet-101) for better performance in object detection. If resources are limited, a shallower ResNet such as ResNet-18 is the lighter choice. VGG16 is still a viable option when a simpler architecture matters more than efficiency, and it can achieve good results.