Linux

  1. chmod
    - chmod 400 file - Read by owner
    - chmod 040 file - Read by group
    - chmod 004 file - Read by world
    
    - chmod 200 file - Write by owner
    - chmod 020 file - Write by group
    - chmod 002 file - Write by world
    
    - chmod 100 file - Execute by owner
    - chmod 010 file - Execute by group
    - chmod 001 file - Execute by world
    
    - chmod 444 file - Allow read permission for owner, group, and world
    - chmod 777 file - Allow everyone to read, write, and execute file
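    
    The octal digits add up: read = 4, write = 2, execute = 1, with one digit each for owner, group, and world. A couple of combined examples (a quick sketch beyond the list above):
    
    - chmod 754 file - Owner read/write/execute (4+2+1), group read/execute (4+1), world read (4)
    - chmod 644 file - Owner read/write, group and world read-only (a common default for regular files)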
    

Arch Linux Setup

  1. Set up Wi-Fi

    iwctl
    
    station wlan0 scan
    station wlan0 get-networks
    station wlan0 connect <Network name>
    
  2. Arch Install

    • GUI
      archinstall
      

    or

    • CLI

      Create partition

      cfdisk /dev/nvme0n1
      
      800 MB for EFI System
      > 20 GB for Linux filesystem
      ... for Linux swap
      

      Format

      mkfs.fat -F32 /dev/<EFI System>
      mkfs.ext4 /dev/<Linux filesystem>
      mkswap /dev/<swap>
      

      Mount

      #root
      mount /dev/<Linux filesystem> /mnt
      mkdir /mnt/boot
      mount /dev/<EFI system> /mnt/boot
      swapon /dev/<swap>
      

      Install

      pacstrap -i /mnt base base-devel linux-zen linux-firmware git sudo neofetch htop intel-ucode nano vim bluez bluez-utils networkmanager 
      
      genfstab -U /mnt >> /mnt/etc/fstab
      cat /mnt/etc/fstab
      

      Enter the system

      arch-chroot /mnt
      
      # change root password
      passwd
      
      # create user
      useradd -m -g users -G wheel,storage,power,video,audio -s /bin/bash <username>
      passwd <username>
      
      EDITOR=vim visudo
      # uncomment line %wheel ALL=(ALL:ALL) ALL
      

      Timezone and Locale

      ln -sf /usr/share/zoneinfo/... /etc/localtime
      hwclock --systohc
      vim /etc/locale.gen #uncomment en_US ...
      locale-gen
      vim /etc/locale.conf # add LANG=en_US.UTF-8
      

      Hostname

      vim /etc/hostname # add hostname
      
      vim /etc/hosts
      # add this line:
      127.0.0.1   localhost
      ::1         localhost
      127.0.1.1    <hostname>.localdomain   <hostname>
      

      Bootloader

      pacman -S grub efibootmgr dosfstools mtools
      
      grub-install --target=x86_64-efi --efi-directory=/boot --bootloader-id=GRUB
      grub-mkconfig -o /boot/grub/grub.cfg
      

      Finish

      systemctl enable bluetooth
      systemctl enable NetworkManager
      exit
      umount -lR /mnt
      

Unplug the USB drive and boot into the system

  1. Enable the Wi-Fi radio

    nmcli dev status
    nmcli radio wifi on
    nmcli dev wifi list
    sudo nmcli dev wifi connect <name> password "<password>"
    
    # update
    sudo pacman -Syu
    

    Install Desktop GUI

    sudo pacman -S xorg sddm plasma-meta plasma-workspace kde-applications
    
    sudo systemctl enable sddm
    sudo systemctl start sddm
    
  2. Fix the Discover App Backend

    sudo pacman -Sy flatpak
    

    Install Nvidia Driver

    lspci | grep -E "NVIDIA"
    
    sudo pacman -Sy nvidia
    
  3. Edit the bootloader

    sudo pacman -Sy os-prober
    sudo vim /etc/default/grub
    # change the following line:
    # GRUB_TIMEOUT=20
    # and uncomment: GRUB_DISABLE_OS_PROBER=false
    
    sudo grub-mkconfig -o /boot/grub/grub.cfg
    
  4. Chinese Characters and Input Method

    sudo pacman -S noto-fonts noto-fonts-cjk noto-fonts-extra noto-fonts-emoji ttf-dejavu ttf-liberation
    sudo pacman -S fcitx5-im fcitx5-rime
    
    mkdir -p ~/.local/share/fcitx5/rime
    cd ~/.local/share/fcitx5/rime
    git clone https://github.com/iDvel/rime-ice.git
    cp -r ./rime-ice/* .
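    
    To make fcitx5 the active input method after login, the following environment variables are typically also required (this is an assumption based on the usual fcitx5 setup, not part of the original steps; adjust for your desktop):
    
    # add to /etc/environment
    GTK_IM_MODULE=fcitx
    QT_IM_MODULE=fcitx
    XMODIFIERS=@im=fcitx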
    

SSH Configuration

  1. RSA

    • RSA keys have been the default for many years and are supported by almost all SSH clients and servers. They are well-understood and trusted in various computing environments. Many systems default to RSA key lengths of 2048 or 3072 bits, though some users prefer 4096 bits for enhanced security.
      ssh-keygen -t rsa -b 4096 -C "your_email@example.com"
      
  2. Ed25519

    • Ed25519 is increasingly popular due to its strong security features and efficiency. It uses elliptic curve cryptography to provide excellent security with shorter keys, resulting in faster performance and less data usage during authentication. Many modern systems and security guidelines now recommend Ed25519 as the preferred choice for new key generation.
      ssh-keygen -t ed25519 -C "your_email@example.com"
      
  3. ECDSA

    • ECDSA is another commonly used type, particularly because it also offers good security with shorter key lengths compared to RSA. It's often used where there's a need for a balance between compatibility and modern cryptographic practices. ECDSA keys using the NIST P-256 curve (nistp256) are particularly common.
      ssh-keygen -t ecdsa -b 256 -C "your_email@example.com"
      
  • note: RSA and Ed25519 are generally the most recommended, with Ed25519 often preferred for new deployments due to its robustness and efficiency. RSA remains widely used due to its long history and broad support across older and legacy systems. For new systems or updates, transitioning to Ed25519 from RSA or ECDSA is a common recommendation for enhanced security and performance.
  1. Server Config

    • Copy and paste the public key into the authorized_keys file on the server.
      echo "paste-your-public-key-here" >> ~/.ssh/authorized_keys
      chmod 600 ~/.ssh/authorized_keys
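      
    • Alternatively, ssh-copy-id can append the public key for you (run from the client; this assumes password login is still enabled on the server):
      ssh-copy-id -i ~/.ssh/id_ed25519.pub <username>@<server>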
      
  2. Local Config

    • Create a config file in the ~/.ssh folder:
      Host <custom name>
      HostName <hostname, the part after @>
      User <username>
      IdentityFile <private key location>
      
    • After configuration, use the following command to connect to the server:
      ssh <custom name>
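      
    • For example, with hypothetical values (host alias, IP, user, and key path are placeholders):
      Host myserver
          HostName 203.0.113.10
          User sysadmin
          IdentityFile ~/.ssh/id_ed25519
      
      ssh myserver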
      

Install Ansible

  1. Initializing
    • Create a folder inventory with a hosts file in it
      [server] # group name
      {ip address}
      {server name}
      
    • Try to ping the servers with password authentication
      ansible -i ./inventory/hosts server -m ping --user sysadmin --ask-pass
      
    • Create a folder playbooks with a YAML file apt.yml in it
      - hosts: "*"
        become: yes
        tasks:
          - name: apt
            apt:
              update_cache: yes
              upgrade: 'yes'
      
    • Run the playbook
      ansible-playbook ./playbooks/apt.yml --user serveradmin --ask-pass --ask-become-pass -i ./inventory/hosts
      
    • Create a file qemu-guest-agent.yml under playbooks to install the QEMU guest agent
      - name: install latest qemu-guest-agent
        hosts: "*"
        tasks:
          - name: install qemu-guest-agent
            apt:
              name: qemu-guest-agent
              state: present
              update_cache: true
            become: true
      
    • Add the mattermost playbook
      ---
      - name: Install Mattermost Server
        hosts: all
        become: yes
        vars:
          mattermost_version: 5.31.0
          mattermost_db_name: mattermost
          mattermost_db_user: mmuser
          mattermost_db_password: mmuser_password
      
        tasks:
          - name: Install necessary packages
            apt:
              name: "{{ item }}"
              state: present
            with_items:
              - git
              - nginx
              - postgresql
              - postgresql-contrib
      
          - name: Create Mattermost user
            user:
              name: mattermost
              state: present
      
          - name: Clone Mattermost server
            git:
              repo: 'https://github.com/mattermost/mattermost-server.git'
              dest: "/opt/mattermost-server"
              version: "v{{ mattermost_version }}"
            become: yes
            become_user: mattermost
      
          - name: Configure PostgreSQL
            block:
              - name: Create Mattermost database
                postgresql_db:
                  name: "{{ mattermost_db_name }}"
                  login_user: postgres
      
              - name: Create Mattermost database user
                postgresql_user:
                  db: "{{ mattermost_db_name }}"
                  name: "{{ mattermost_db_user }}"
                  password: "{{ mattermost_db_password }}"
                  priv: ALL
                  login_user: postgres
      
          - name: Set up Mattermost configuration
            template:
              src: mattermost_config.json.j2
              dest: "/opt/mattermost-server/config/config.json"
              owner: mattermost
              mode: '0644'
      
          - name: Start Mattermost service
            systemd:
              name: mattermost
              state: started
              enabled: yes
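      
    • The playbook above references a template mattermost_config.json.j2 that is not included here. A minimal sketch of what it might contain, covering only the listen address and database settings (the real Mattermost config.json has many more keys, so treat this as an assumption to adapt):
      {
        "ServiceSettings": {
          "SiteURL": "",
          "ListenAddress": ":8065"
        },
        "SqlSettings": {
          "DriverName": "postgres",
          "DataSource": "postgres://{{ mattermost_db_user }}:{{ mattermost_db_password }}@localhost:5432/{{ mattermost_db_name }}?sslmode=disable"
        }
      }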
      

Virtual Machine with Vagrant

  1. Download and Install Tools

  2. Get the Linux Box from Vagrant Cloud

    • Visit Vagrant Cloud to find a suitable Linux box. Alternatively, you can add a Linux box directly using the command line:
      vagrant box add [box_name]
      
      Replace [box_name] with the name of the Linux box you want to use.
  3. Initialize Vagrant Environment

    • Initialize the VM with the following command:
      vagrant init [box_name]
      
      Again, replace [box_name] with the name of your chosen box.
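      
    • vagrant init writes a Vagrantfile in the current directory, which you can customize before starting the VM. A minimal sketch (the box name, hostname, and resource values are placeholders):
      Vagrant.configure("2") do |config|
        config.vm.box = "[box_name]"
        config.vm.hostname = "devbox"
        config.vm.provider "virtualbox" do |vb|
          vb.memory = 2048   # MB of RAM for the VM
          vb.cpus = 2
        end
      end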
  4. Start the Virtual Machine

    • Start the VM with:
      vagrant up
      
  5. Check Installed Linux Box Version

    • To check the installed Linux version and other boxes, use:
      vagrant box list
      
  6. Connect to VM

    • Connect to your VM via SSH using:
      vagrant ssh
      
  7. Disconnect from the VM

    • suspend the VM
      vagrant suspend
      
    • resume from suspend
      vagrant resume
      
    • shut down the VM
      vagrant halt
      

Add-on Features for Linux Apps

  1. NeoVim Setup

    Requirements:

    • Install a Nerd Font first
      wget https://github.com/ryanoasis/nerd-fonts/releases/download/v3.2.1/Hack.zip
      unzip Hack.zip -d Hack
      mkdir -p ~/.local/share/fonts
      cp Hack/*.ttf ~/.local/share/fonts/
      fc-cache -fv
      
    • Install npm
      sudo apt install npm
      
    git clone https://github.com/Henryfzh/documentation.git
    
    `gcc` - Toggles the current line using linewise comment
    `gbc` - Toggles the current line using blockwise comment
    `[count]gcc` - Toggles the number of lines given as a prefix-count using linewise comment
    `[count]gbc` - Toggles the number of lines given as a prefix-count using blockwise comment
    `gc[count]{motion}` - (Op-pending) Toggles the region using linewise comment
    `gb[count]{motion}` - (Op-pending) Toggles the region using blockwise comment
    
  2. Theme

    Blur the windows:

    mutter-rounded
    mutter-rounded setting
    
  3. mdBook

    Requirements:

    • Install Rust
      cargo install mdbook
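      
    • Basic usage (standard mdBook commands; "my-book" is a placeholder name):
      mdbook init my-book    # scaffold a new book
      cd my-book
      mdbook serve           # build and serve at http://localhost:3000
      mdbook build           # write static HTML to ./book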
      
  4. tmux

    Install TPM:

    • Clone:
      git clone https://github.com/tmux-plugins/tpm ~/.tmux/plugins/tpm
      
    • Create ~/.tmux.conf, and add the following to it:
      # List of plugins
      set -g @plugin 'tmux-plugins/tpm'
      set -g @plugin 'tmux-plugins/tmux-sensible'
      
      set -g @plugin 'catppuccin/tmux'
      set -g @catppuccin_flavour 'mocha'
      
      run '~/.tmux/plugins/tpm/tpm'
      
      set -g default-terminal 'tmux-256color'
      
    • Install
      prefix + I (Ctrl+B, then Shift+i)
      
    • Reload:
      tmux source ~/.tmux.conf
      
  5. zsh fuzzy finder

    fzf
    

    .zshrc plugins

    powerlevel10k
    
    copypath copyfile copybuffer
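    
    Assuming the plugins above come from Oh My Zsh and the powerlevel10k theme is installed, ~/.zshrc enables them roughly like this (a sketch; adjust to your own setup):
    
    ZSH_THEME="powerlevel10k/powerlevel10k"
    plugins=(git fzf copypath copyfile copybuffer)
    source $ZSH/oh-my-zsh.sh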
    

    Flatpak zsh in VS Code: add the following lines to VS Code's settings.json

    "terminal.integrated.defaultProfile.linux": "bash",
     "terminal.integrated.profiles.linux": {
       "bash": {
         "path": "/usr/bin/flatpak-spawn",
         "overrideName": true,
         "args": ["--host", "--env=TERM=xterm-256color", "zsh"]
       }
     },
    

Docker Basics

  1. Build

    sudo docker build -t <target-name> -f Dockerfile .
    
  2. Run

    • With bash
      sudo docker run --rm -it --device /dev/kfd --device /dev/dri --security-opt seccomp=unconfined <target-name> /bin/bash
      
    • Without bash
      sudo docker run --rm --device /dev/kfd --device /dev/dri --security-opt seccomp=unconfined rochpl.6.0  mpirun_rochpl -P 1 -Q 1 -N 45312
      
    • Mount a Directory
      sudo docker run --rm -it --device /dev/kfd --device /dev/dri --security-opt seccomp=unconfined --network host --name rochpl_node -v <directory>:/opt rochpl /usr/sbin/sshd -D
      
  3. Update a Docker Image

    • commit the changes
      docker ps # to get the container id
      
      docker commit <container-id> <new-image-name>:<tag>
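      
      For example, with a hypothetical container ID and image tag:
      sudo docker commit 3f2a1b9c rochpl:patched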
      
  4. Clean

    # remove all unused images, stopped containers, networks, and build cache
    sudo docker system prune -a 
    sudo docker container prune
    sudo docker buildx prune -f
    

High Performance Linpack

  1. Install OpenMPI
    • Download and unzip
      wget https://download.open-mpi.org/release/open-mpi/v5.0/openmpi-5.0.6.tar.gz
      tar xzf openmpi-5.0.6.tar.gz
      cd openmpi-5.0.6/
      
    • Compile
      Preferably installed to "/usr/local/", but that requires sudo access
      ./configure --prefix=<OPENMPI_INSTALL_DIRECTORY>
      make
      make install
      
  2. Install OpenBLAS
    • Download and unzip
      wget https://github.com/OpenMathLib/OpenBLAS/releases/download/v0.3.28/OpenBLAS-0.3.28.tar.gz
      tar xzf OpenBLAS-0.3.28.tar.gz
      cd OpenBLAS-0.3.28/
      
    • Compile
      Preferably installed to "/usr/local/", but that requires sudo access
      make
      make PREFIX=<OPEN_BLAS_INSTALL_DIRECTORY> install
      
  3. Update Path
    • Update path to OpenMPI and OpenBLAS in .bashrc or .zshrc
      export PATH=<OPENMPI_INSTALL_DIRECTORY>/bin:$PATH
      export LD_LIBRARY_PATH=<OPENMPI_INSTALL_DIRECTORY>/lib:$LD_LIBRARY_PATH
      export LD_LIBRARY_PATH=<OPEN_BLAS_INSTALL_DIRECTORY>/lib:$LD_LIBRARY_PATH
      
      source ~/.bashrc
      
      or
      source ~/.zshrc
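      
    • Verify that the toolchain is picked up from the updated paths:
      which mpicc
      mpirun --version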
      
  4. Download HPL
    • Download from the official website using wget or curl:
      wget http://www.netlib.org/benchmark/hpl/hpl-2.3.tar.gz
      
      or
      curl -O http://www.netlib.org/benchmark/hpl/hpl-2.3.tar.gz
      
    • Unzip
      tar -xf hpl-2.3.tar.gz
      

    1. Compile with CPU

    2. Compile with AMD GPU

High Performance Linpack with CPU

  1. Compile HPL
    • Copy the template Makefile:
      cp setup/Make.Linux_Intel64 Make.Linux_Intel64
      
    • Edit the Makefile and change the following lines:
      TOPdir = <hpl-2.3 top folder directory>
      
      MPdir =  <openmpi file directory>
      MPinc = -I$(MPdir)/include
      MPlib = -L$(MPdir)/lib -lmpi
      
      LAdir = <openblas file directory>
      LAinc = -I$(LAdir)/include
      LAlib = $(LAdir)/lib/libopenblas.a
      
      CC = mpicc
      CCNOOPT = $(HPL_DEFS)
      CCFLAGS = $(HPL_DEFS) -O3 -w -z noexecstack -z relro -z now -Wall # modify this according to the cpu
      
      LINKFLAGS = $(CCFLAGS) $(OMP_DEFS)
      
    • Compile
      make arch=Linux_Intel64
      
    • If you want to clean:
      make clean arch=Linux_Intel64
      
  2. Run HPL
    • Edit the file bin/Linux_Intel64/HPL.dat inside the top folder.
      Here is an example for 8 GB of RAM and a 4-core CPU:

      HPLinpack benchmark input file
      Innovative Computing Laboratory, University of Tennessee
      HPL.out      output file name (if any) 
      6            device out (6=stdout,7=stderr,file)
      1            # of problems sizes (N)
      29184         Ns
      1            # of NBs
      192           NBs
      0            PMAP process mapping (0=Row-,1=Column-major)
      1            # of process grids (P x Q)
      2            Ps
      2            Qs
      16.0         threshold
      1            # of panel fact
      2            PFACTs (0=left, 1=Crout, 2=Right)
      1            # of recursive stopping criterium
      4            NBMINs (>= 1)
      1            # of panels in recursion
      2            NDIVs
      1            # of recursive panel fact.
      1            RFACTs (0=left, 1=Crout, 2=Right)
      1            # of broadcast
      1            BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
      1            # of lookahead depth
      1            DEPTHs (>=0)
      2            SWAP (0=bin-exch,1=long,2=mix)
      64           swapping threshold
      0            L1 in (0=transposed,1=no-transposed) form
      0            U  in (0=transposed,1=no-transposed) form
      1            Equilibration (0=no,1=yes)
      8            memory alignment in double (> 0)
      ##### This line (no. 32) is ignored (it serves as a separator). ######
      0                               Number of additional problem sizes for PTRANS
      1200 10000 30000                values of N
      0                               number of additional blocking sizes for PTRANS
      40 9 8 13 13 20 16 32 64        values of NB
      
    • To tune the parameters, you can reference the website here; the values above are not guaranteed to be optimal, so tune them yourself (see the worked sizing example after this list).

      The parameters you most likely need to tune are:

      • Ps * Qs: the number of cores
      • Ns: the problem size
      • NBs: the block size
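
      A common rule of thumb (an assumption here, not taken from the HPL input file above) is to size N so the matrix fills roughly 80% of total RAM, then round down to a multiple of NB:

      # N ~ sqrt(0.80 * RAM_in_bytes / 8)        (8 bytes per double-precision element)
      # 8 GB example: sqrt(0.80 * 8 * 1024^3 / 8) ~ 29308
      # round down to a multiple of NB = 192: 152 * 192 = 29184  -> the Ns used above
      # Ps * Qs should equal the MPI process count: 2 * 2 = 4 -> mpirun -np 4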
    • Run benchmark

      mpirun -np <number of cores> ./xhpl
      

High Performance Linpack with AMD GPU

Prepare (Download the Dockerfile)

Dockerfile for AMD GPU

1. Build Dockerfile

sudo docker build -t rochpl -f Dockerfile .

2. Setup Docker image

  • Node A
    docker save -o rochpl_image.tar rochpl
    scp rochpl_image.tar user@10.0.0.12:~
    
  • Node B
    docker load -i ~/rochpl_image.tar
    
  • Both nodes
    sudo docker run --rm -it \
    --device /dev/kfd \
    --device /dev/dri \
    --security-opt seccomp=unconfined \
    --network=host \
    --name=rochpl_node \
    rochpl /bin/bash
    
  • Setup SSH keys
    # Both Nodes
    ssh-keygen -t rsa -f ~/.ssh/id_rsa -q -N ""
    
    # Both Nodes
    vim /etc/ssh/sshd_config 
    # change the line --- PasswordAuthentication yes
    # add this line --- PermitRootLogin yes
    
    # Node A
    ssh-copy-id -p 2222 root@10.0.0.12
    
    # Node B
    ssh-copy-id -p 2222 root@10.0.0.14
    
  • Add the following to both nodes
    vim ~/.ssh/config
    
    Host 10.0.0.14
        Port 2222
        User root
    
    Host 10.0.0.12
        Port 2222
        User root
    
    • Test if it works
      ssh 10.0.0.14 hostname
      

3. Run HPL

  • Add the rochpl_hostfile on both nodes
    10.0.0.14 slots=4
    10.0.0.12 slots=4
    
  • Run HPL using this command (modify the arguments to suit your environment)
    export OMPI_MCA_pmix=pmix
    
    mpirun --hostfile rochpl_hostfile -np 8 --bind-to none   -x HIP_VISIBLE_DEVICES=0,1,2,3   --mca pml ucx --mca btl ^vader,tcp,openib,uct   ./run_rochpl -P 2 -Q 4 -N 256000 --NB 512
    
ARG UBUNTU_VERSION="jammy"

FROM ubuntu:${UBUNTU_VERSION}

ARG ROCM_URL="https://repo.radeon.com/amdgpu-install/6.1.1/ubuntu/jammy/amdgpu-install_6.1.60101-1_all.deb"
ARG UCX_BRANCH="v1.16.0"
ARG UCC_BRANCH="v1.3.0"
ARG OMPI_BRANCH="v5.0.3"
ARG APT_GET_APPS=""
ARG GPU_TARGET="gfx908,gfx90a,gfx942"

# Update and Install basic Linux development tools
RUN apt-get update \
    && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
        ca-certificates \
        git \
        ssh \
        openssh-client \
        openssh-server \
        make \
        vim \
        nano \
        libtinfo-dev \
        initramfs-tools \
        libelf-dev \
        numactl \
        curl \
        wget \
        tmux \
        build-essential \
        autoconf \
        automake \
        libtool \
        pkg-config \
        libnuma-dev \
        gfortran \
        flex \
        hwloc \
        libstdc++-12-dev \
        libxml2-dev \
        python3-dev \
        python3-pip \
        python3-distutils \
        unzip ${APT_GET_APPS}\
    && apt-get clean

RUN wget -qO- https://repo.radeon.com/rocm/rocm.gpg.key | gpg --dearmor | tee /etc/apt/trusted.gpg.d/rocm.gpg \
    && wget -O rocm.deb ${ROCM_URL} \
    && apt install -y ./rocm.deb \
    && amdgpu-install --usecase=rocm,hiplibsdk --no-dkms -y

RUN bash -c """IFS=',' read -r -a ARCH <<<${GPU_TARGET} \
        &&  for gpu_arch in \${ARCH[@]}; do \
            echo \$gpu_arch  >> /opt/rocm/bin/target.lst; \
        done""" \
    && chmod a+r /opt/rocm/bin/target.lst 

# Requires cmake > 3.22
RUN mkdir -p /opt/cmake  \
  && wget --no-check-certificate --quiet -O - https://cmake.org/files/v3.27/cmake-3.27.7-linux-x86_64.tar.gz | tar --strip-components=1 -xz -C /opt/cmake

ENV ROCM_PATH=/opt/rocm \
    UCX_PATH=/opt/ucx \
    UCC_PATH=/opt/ucc \
    OMPI_PATH=/opt/ompi \
    GPU_TARGET=${GPU_TARGET}

# Adding rocm/cmake to the Environment 
ENV PATH=$ROCM_PATH/bin:/opt/cmake/bin:$PATH \
    LD_LIBRARY_PATH=$ROCM_PATH/lib:$ROCM_PATH/lib64:$ROCM_PATH/llvm/lib:$LD_LIBRARY_PATH \
    LIBRARY_PATH=$ROCM_PATH/lib:$ROCM_PATH/lib64:$LIBRARY_PATH \
    C_INCLUDE_PATH=$ROCM_PATH/include:$C_INCLUDE_PATH \
    CPLUS_INCLUDE_PATH=$ROCM_PATH/include:$CPLUS_INCLUDE_PATH \
    CMAKE_PREFIX_PATH=$ROCM_PATH/lib/cmake:$CMAKE_PREFIX_PATH

# Create the necessary directory for SSH
RUN mkdir /var/run/sshd

# Set root password for login
RUN echo 'root:redhat' | chpasswd

# Allow root login and password authentication
RUN sed -i 's/#PasswordAuthentication no/PasswordAuthentication yes/' /etc/ssh/sshd_config && \
    sed -i 's/PermitRootLogin prohibit-password/PermitRootLogin yes/' /etc/ssh/sshd_config && \
    echo "StrictModes no" >> /etc/ssh/sshd_config

# Change the SSH port to 2222
RUN sed -i 's/#Port 22/Port 2222/' /etc/ssh/sshd_config

# Expose the new SSH port
EXPOSE 2222

# Start the SSH service and keep the container running
ENTRYPOINT service ssh restart && bash

WORKDIR /tmp


# Install UCX
RUN git clone https://github.com/openucx/ucx.git -b ${UCX_BRANCH} \
    && cd ucx \
    && ./autogen.sh \
    && mkdir build \
    && cd build \
    && ../contrib/configure-release --prefix=$UCX_PATH \
        --with-rocm=$ROCM_PATH \
        --without-knem \
        --without-xpmem  \
        --without-cuda \
        --enable-optimizations  \
        --disable-logging \
        --disable-debug \
        --disable-examples \
    && make -j $(nproc)  \
    && make install

# Install UCC
RUN git clone -b ${UCC_BRANCH} https://github.com/openucx/ucc \
    && cd ucc \
    && ./autogen.sh \
    && sed -i 's/memoryType/type/g' ./src/components/mc/rocm/mc_rocm.c \
    # offload-arch=native builds for the local architecture, which may not be present at build time for a container.
    && sed -i 's/--offload-arch=native//g' ./cuda_lt.sh \
    && mkdir build \
    && cd build \
    && ../configure --prefix=${UCC_PATH} --with-rocm=${ROCM_PATH} --with-ucx=${UCX_PATH} --with-rccl=no  \
    && make -j $(nproc) \
    && make install


# Install OpenMPI
RUN git clone --recursive https://github.com/open-mpi/ompi.git -b ${OMPI_BRANCH} \
    && cd ompi \
    && ./autogen.pl \
    && mkdir build \
    && cd build \
    && ../configure --prefix=$OMPI_PATH --with-ucx=$UCX_PATH \
        --with-ucc=${UCC_PATH} \
        --enable-mca-no-build=btl-uct  \
        --without-verbs \
        --with-pmix=internal \
        --enable-mpi \
        --enable-mpi-fortran=yes \
        --disable-man-pages \
        --disable-debug \
    && make -j $(nproc) \
    && make install

# Adding OpenMPI, UCX, and UCC to Environment
ENV PATH=$OMPI_PATH/bin:$UCX_PATH/bin:$UCC_PATH/bin:$PATH \
    LD_LIBRARY_PATH=$OMPI_PATH/lib:$UCX_PATH/lib:$UCC_PATH/lib:$LD_LIBRARY_PATH \
    LIBRARY_PATH=$OMPI_PATH/lib:$UCX_PATH/lib:$UCC_PATH/lib:$LIBRARY_PATH \
    C_INCLUDE_PATH=$OMPI_PATH/include:$UCX_PATH/include:$UCC_PATH/include:$C_INCLUDE_PATH \
    CPLUS_INCLUDE_PATH=$OMPI_PATH/include:$UCX_PATH/include:$UCC_PATH/include:$CPLUS_INCLUDE_PATH \
    PKG_CONFIG_PATH=$OMPI_PATH/lib/pkgconfig:$UCX_PATH/lib/pkgconfig/:$PKG_CONFIG_PATH  \
    OMPI_ALLOW_RUN_AS_ROOT=1 \
    OMPI_ALLOW_RUN_AS_ROOT_CONFIRM=1 \
    UCX_WARN_UNUSED_ENV_VARS=n

# Install Additional Apps Below
ARG HPL_BRANCH="main" 
 
WORKDIR /opt 
 
# Installing rocHPL 
RUN git clone -b ${HPL_BRANCH} https://github.com/ROCmSoftwarePlatform/rocHPL.git \ 
    && cd rocHPL \ 
    && ./install.sh \ 
        --prefix=/opt/rochpl \ 
        --with-rocm=/opt/rocm/ \ 
        --with-mpi=/opt/ompi \ 
    && rm -rf /tmp/rocHPL 
 
ENV PATH=$PATH:/opt/rochpl:/opt/rochpl/bin
ENV HIP_VISIBLE_DEVICES=0,1,2,3


#CMD ["/usr/sbin/sshd", "-D"]
CMD ["/bin/bash"]

Machine Learning

Computer Vision

  1. VGG16

    • VGG16 has a total of 138 million parameters. The important point to note here is that all the conv kernels are of size 3x3 and maxpool kernels are of size 2x2 with a stride of two. VGG16 Architecture
  2. ResNet

    • ResNet-18 has around 11 million trainable parameters. It consists of conv layers with filters of size 3x3 (just like VGGNet). Only two pooling layers are used throughout the network, one at the beginning and the other at the end. Identity connections run between every two conv layers. The solid arrows show identity shortcuts where the input and output dimensions are the same, while the dotted ones represent projection connections where the dimensions differ. ResNet Architecture
  3. Architecture Differences:

    • VGG16: VGG16 is a deep convolutional network with a straightforward and uniform architecture, consisting of 16 layers with very small (3x3) convolution filters. It is known for its simplicity and has been a popular choice for image classification tasks.
    • ResNet: ResNet, particularly ResNet-50, uses residual connections that help mitigate the vanishing gradient problem, allowing for the training of much deeper networks. ResNet architectures are typically deeper and more complex than VGG16, which generally results in better feature extraction and higher accuracy in many tasks.
  4. Performance:

    • Accuracy: ResNet models, due to their depth and residual connections, generally outperform VGG16 in many image recognition tasks, including object detection. They are able to learn more complex features and provide better accuracy.
    • Computation and Memory: ResNet models are usually more computationally expensive and require more memory compared to VGG16. This can be a consideration if you have limited computational resources.
  5. Application in Object Detection:

    • Object detection frameworks such as Faster R-CNN, SSD, and YOLO have utilized both VGG and ResNet as backbone feature extractors. In many cases, ResNet-based models have shown better performance in terms of both precision and recall.
    • For instance, Faster R-CNN with a ResNet-50 or ResNet-101 backbone generally performs better than the same framework with a VGG16 backbone.

Practical Considerations:

  • ResNet Advantages:

    • Better accuracy and feature representation due to deeper network architecture.
    • Residual connections help in training deeper networks, resulting in improved performance.
  • VGG16 Advantages:

    • Simpler architecture which can be easier to implement and train.
    • Less computationally intensive compared to ResNet.

Conclusion:

In general, ResNet models tend to be better than VGG16 for object detection tasks due to their superior feature extraction capabilities and higher accuracy. However, this comes at the cost of increased computational requirements.

If computational resources are not a constraint, it is recommended to use ResNet (e.g., ResNet-50 or ResNet-101) for better performance in object detection. However, if you need a simpler and less resource-intensive model, VGG16 is still a viable option and can achieve good results.