Mastering the OCI CLI: What It Does, Tips, Tricks, and Gotchas

The Oracle Cloud Infrastructure (OCI) Command Line Interface (CLI) is a powerful tool for managing OCI resources programmatically. It allows developers, DevOps engineers, and administrators to interact with Oracle Cloud services using scripts or direct commands, streamlining tasks like provisioning compute instances, managing storage, or configuring Kubernetes clusters. This blog post explores what the OCI CLI does, along with essential tips, tricks, and common gotchas to help you use it effectively.

What Does the OCI CLI Do?

The OCI CLI is a Python-based command-line tool that provides a programmatic interface to Oracle Cloud Infrastructure services. It allows you to:

  • Manage Resources: Create, update, and delete resources such as virtual machines, block volumes, object storage, and Kubernetes clusters.
  • Automate Tasks: Integrate OCI operations into CI/CD pipelines, scripts, or automation workflows.
  • Access Advanced Features: Perform operations not always available in the OCI Console, such as fine-grained control over resources or bulk operations.
  • Debug and Monitor: Use the --debug flag to troubleshoot issues or inspect API calls.

The CLI supports a wide range of OCI services, including Compute, Networking, Storage, Identity, and Container Engine for Kubernetes (OKE). It’s particularly useful for automating repetitive tasks, managing large-scale deployments, or accessing features programmatically.

Tips and Tricks for Using the OCI CLI

1. Simplify Installation with the Right Python Version

The OCI CLI requires Python 3.11 to function correctly. Using Python 3.10, 3.12, or 3.13 will result in installation failures due to compatibility issues (more on this in the “Gotchas” section). Ensure you have Python 3.11 installed before proceeding.

Tip: Use a version manager like pyenv to manage multiple Python versions and avoid conflicts:

pyenv install 3.11.0
pyenv global 3.11.0

2. Quick Setup with Config File

After installing the OCI CLI, configure it using:

oci setup config

This command generates a configuration file at ~/.oci/config and prompts you for details like user OCID, tenancy OCID, region, and API key. You can also create the file manually for scripted setups.

Tip: Use multiple profiles in the config file to manage different OCI accounts:

[DEFAULT]
user=ocid1.user.oc1..<user-ocid>
fingerprint=<fingerprint>
key_file=~/.oci/oci_api_key.pem
tenancy=ocid1.tenancy.oc1..<tenancy-ocid>
region=us-phoenix-1

[SECOND_PROFILE]
user=ocid1.user.oc1..<another-user-ocid>
fingerprint=<another-fingerprint>
key_file=~/.oci/another_oci_api_key.pem
tenancy=ocid1.tenancy.oc1..<another-tenancy-ocid>
region=us-ashburn-1

Switch profiles with the --profile flag:

oci compute instance list --profile SECOND_PROFILE

3. Leverage JSON Output for Automation

The OCI CLI outputs results in JSON by default, making it ideal for scripting. Parse outputs with tools like jq:

oci compute instance list --compartment-id <compartment-ocid> | jq '.data[].id'

Tip: Use --output table for human-readable output during manual exploration:

oci compute instance list --output table
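
The CLI also has a built-in --query option that accepts a JMESPath expression, which can replace jq for simple extractions. The expression below is just a sketch of the syntax; adapt it to the fields you need:

oci compute instance list --compartment-id <compartment-ocid> --query 'data[*].id'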

4. Use Environment Variables for Sensitive Data

Avoid hardcoding sensitive information in scripts. Set environment variables for credentials:

export OCI_CLI_USER=ocid1.user.oc1..<user-ocid>
export OCI_CLI_FINGERPRINT=<fingerprint>
export OCI_CLI_KEY_FILE=~/.oci/oci_api_key.pem
export OCI_CLI_TENANCY=ocid1.tenancy.oc1..<tenancy-ocid>
export OCI_CLI_REGION=us-phoenix-1

The CLI will automatically use these variables, reducing the risk of exposing credentials.
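
A quick way to confirm the variables are being picked up is to run a simple read-only command that needs no compartment or config file; listing regions is one option:

oci iam region list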

5. Debug with the --debug Flag

If a command fails, use the --debug flag to get detailed logs, including HTTP requests and responses:

oci ce cluster create-kubeconfig --cluster-id <cluster-ocid> --debug

This is invaluable for troubleshooting authentication or resource issues.

Gotchas to Watch Out For

1. Python Version Compatibility

The OCI CLI is strict about Python versions. Only Python 3.11 is supported. Using other versions results in errors:

    • Python 3.10 fails with a SameFileError during installation:
shutil.SameFileError: 'C:\\Program Files (x86)\\Oracle\\oci_cli\\Scripts\\create_backup_from_onprem.exe' and 'C:\\Program Files (x86)\\Oracle\\oci_cli\\scripts\\create_backup_from_onprem.exe' are the same file
    • Python 3.12 and 3.13 fail due to the removal of the distutils module:
ModuleNotFoundError: No module named 'distutils'

This issue has been reported to Oracle (see GitHub issue #968).

Workaround: Install Python 3.11 explicitly and verify the version:

python --version

2. Switching OCI Accounts

When switching between OCI accounts, the .oci directory in your home folder (~/.oci on Linux/Mac or C:\Users\<username>\.oci on Windows) can cause conflicts. If not updated, you may encounter 404 NotAuthorizedOrNotFound errors:

ServiceError: {"code": "NotAuthorizedOrNotFound", "message": "Authorization failed or requested resource not found."}

Workaround: Rename or remove the .oci directory before re-running oci setup config:

mv ~/.oci ~/.oci_backup
oci setup config

3. Cygwin Hangs with oci ce cluster create-kubeconfig

Running the oci ce cluster create-kubeconfig command in Cygwin can cause the command to hang after entering the passphrase:

Enter a passphrase for your private key ("N/A" for no passphrase): "N/A"

Workaround: Use PowerShell instead of Cygwin for this command:

oci ce cluster create-kubeconfig --cluster-id <cluster-ocid> --file $HOME/.kube/config --region us-phoenix-1 --token-version 2.0.0 --kube-endpoint PUBLIC_ENDPOINT

4. API Key Quota Limits

OCI imposes a limit of 3 API keys per user. Exceeding this limit results in an IdcsConversionError:

ServiceError: {"code": "IdcsConversionError", "message": "You can not create ApiKey as maximum quota limit of 3 has been reached.", "status": "400"}

Workaround: Remove unused API keys via the OCI Console:

  1. Navigate to Identity & Security > Users.
  2. Select your user and go to the API Keys section.
  3. Delete unnecessary keys to free up quota.
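
If you still have a working profile, you can also list and remove API keys from the CLI itself. The subcommands below come from the iam user api-key command group; double-check them with --help on your CLI version:

oci iam user api-key list --user-id <user-ocid>
oci iam user api-key delete --user-id <user-ocid> --fingerprint <fingerprint>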

5. Config File Not Found in Cygwin

In Cygwin, you might see:

ERROR: Could not find config file at C:\Users\<username>\.oci\config

This happens because Cygwin uses a different home directory (/home/<username>). The CLI expects the config file in the Windows home directory.

Workaround: Copy the .oci directory to the Windows home directory or specify the config file explicitly:

oci --config-file /cygdrive/c/Users/<username>/.oci/config ...
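
Alternatively, set the config file location once in your Cygwin shell profile so you don't have to repeat the flag. The CLI should honor the OCI_CLI_CONFIG_FILE environment variable for this, but verify on your version:

export OCI_CLI_CONFIG_FILE=/cygdrive/c/Users/<username>/.oci/config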

Bonus: Advanced Tricks

1. Batch Operations with Scripts

Use loops in Bash or PowerShell to perform bulk operations. For example, to terminate multiple instances:

for instance in $(oci compute instance list --compartment-id <compartment-ocid> | jq -r '.data[].id'); do
    oci compute instance action --instance-id $instance --action TERMINATE
done

2. Use Aliases for Common Commands

Save time by creating aliases for frequently used commands:

alias oci-instances='oci compute instance list --compartment-id <compartment-ocid>'

Add this to your ~/.bashrc or ~/.zshrc for persistence.

3. Validate Inputs with --dry-run

A few OCI CLI commands, notably the Object Storage bulk operations (bulk-upload, bulk-download, bulk-delete), support a --dry-run flag that shows what the command would do without making any changes. Check oci <command> --help to see whether your command supports it, and use it to verify a command before execution:

oci os object bulk-delete --dry-run --bucket-name <bucket-name>

Conclusion

The OCI CLI is a versatile tool for managing Oracle Cloud Infrastructure resources, but it comes with its share of nuances. By sticking to Python 3.11, managing your .oci directory carefully, and using PowerShell for specific commands like create-kubeconfig, you can avoid common pitfalls. Leverage tips like JSON parsing, environment variables, and debug mode to enhance your workflows. With these insights, you’ll be well-equipped to harness the full power of the OCI CLI for automation and resource management.

For more details, check the OCI CLI documentation or report issues on the OCI CLI GitHub.

Why I’m Rolling Back Security Patch 16.5.1 (c) on iPhone

UPDATE: I’ve since upgraded iOS to 18.3. After removing the device from Bluetooth and adding it back, the notifications worked again.

I seldom roll back patches, especially security patches; after all, I work in security. It’s essential to stay up-to-date with the latest security patches to protect our devices from potential threats. However, sometimes these patches can inadvertently introduce new issues that affect the overall user experience. In this blog post, we’ll explore the reasons behind the decision to roll back security patch 16.5.1 (c) on the iPhone, focusing on the specific issues it caused: the breakdown of Bluetooth texting through Microsoft Phone Link and of Tesla’s voice texting functionality.

The Importance of Security Patches:

Before we delve into the reasons for rolling back the security patch, let’s acknowledge the significance of timely updates. Security patches play a crucial role in safeguarding our devices against potential vulnerabilities, malware, and other cyber threats. They are essential for maintaining a secure and stable environment for users.

The Problematic Patch:

While patch 16.5.1 (c) aimed to improve the security of iPhones, it unintentionally disrupted certain functionality, leading to frustration for many users. The two most prominent issues were the inability to text through Microsoft Phone Link and the failure of voice texting in Tesla vehicles.

Bluetooth Texting with Microsoft Phone Link:
Many iPhone users rely on Bluetooth connectivity to stay connected while on the move. Microsoft Phone Link provides seamless integration between iPhones and Windows PCs, allowing users to send and receive text messages directly from their computers. However, after applying security patch 16.5.1 (c), users noticed that this feature ceased to function as expected.

Rolling back the patch is essential to restore this valuable functionality, enabling users to stay productive and connected regardless of the device they are using.

Texting with Voice from Tesla Vehicles:
Voice-activated features have become increasingly popular in modern vehicles, enhancing convenience and safety while driving. Tesla’s voice texting functionality enabled iPhone users to send and receive texts hands-free, contributing to a safer driving experience. Unfortunately, after applying the security patch, this feature stopped working in Tesla vehicles.

With safety being a top priority, it’s crucial to resolve this issue promptly. By rolling back the patch, iPhone users can once again enjoy the convenience and safety of voice texting in their Tesla vehicles.

The Decision to Roll Back:

The decision to roll back the security patch was not taken lightly, considering the importance of maintaining a secure device. However, the issues faced by users with Bluetooth texting and voice texting in Tesla vehicles were significant enough to warrant action. The lack of these essential functionalities could hinder productivity, communication, and safety, potentially outweighing the security benefits of the patch.


Apple’s Commitment to Users:

As a tech giant, Apple is known for its dedication to providing a seamless user experience. In light of the reported issues, it is reasonable to expect that Apple will address the problems promptly and release a revised security patch that addresses the concerns without compromising on security.

Conclusion:

Staying on top of security updates is crucial in today’s digital landscape, but sometimes unforeseen issues can arise. The decision to roll back security patch 16.5.1 (c) on the iPhone, which broke Bluetooth texting through Microsoft Phone Link and disabled Tesla’s voice texting functionality, aims to prioritize user convenience and safety. Apple’s commitment to its users should lead to a swift resolution of the problems faced, ensuring a harmonious balance between security and seamless connectivity in future updates.

Navigating Cellular Roaming: Overcoming Signal Challenges and Troubleshooting Tips

Cellular roaming has revolutionized the way we stay connected while traveling abroad. However, even with advancements in technology, occasional hiccups can occur. We’ll delve into a personal experience with cellular roaming in Japan, where signal fluctuations and unexpected network transitions left me without data service. We’ll explore the potential causes of the issue, including multiple SIM cards, and discuss the simple yet effective troubleshooting technique that saved the day.

The Roaming Experience in Japan:
During my recent visit to Japan, I had the opportunity to experience cellular roaming firsthand. I was using SoftBank as my service provider, while my daughter’s phone was on Docomo. Initially, both phones provided seamless connectivity. However, I encountered an unexpected obstacle when I lost signal inside a train station, and my phone switched over to the AU network. To my dismay, despite having full signal bars, I found myself without data service or any functionality.

Two SIM Cards, One Phone: Unraveling the Mystery
Initially, I suspected that the presence of two SIM cards in my phone—a physical SIM card and an e-SIM—might have caused the connectivity issue. With this theory in mind, I promptly removed the physical SIM card, leaving only the e-SIM active. However, to my surprise, this did not resolve the problem. Even with the e-SIM as the sole active SIM, my phone remained unable to establish a connection on the AU network.

Exploring Cellular Options:
Undeterred, I delved into my phone’s cellular options, hoping to find a resolution there. I experimented with different roaming providers listed in the settings, including unfamiliar names like KDDI and NTT Docomo. Despite my attempts, none of these changes had any positive impact on my connectivity. The frustration of being unable to establish a data connection persisted.

Rebooting: The Troubleshooting Hero
Frustrated by the lack of connectivity, I decided to reboot the phone. To my relief, this simple action proved to be the solution. After restarting the device, it automatically reconnected to the available network, and I was once again able to access data services. It’s important to note that rebooting your device can often resolve various software-related issues, including those affecting cellular connectivity.

Understanding Signal Strength and Network Transitions:
The scenario described above highlights the importance of signal strength and network transitions in cellular roaming. While my phone displayed full signal bars on the AU network, the lack of data service indicated a weak or unstable signal. Network transitions occur when your phone automatically switches between different cellular networks to maintain connectivity. These transitions depend on factors such as signal strength, network availability, and agreements between your home network provider and the roaming network. In this case, the network transition to AU resulted in a temporary loss of service until the device was rebooted.

INTERESTING UPDATE: I noticed the phone roamed onto the AU network again later in the day, and it worked just fine. I’m guessing that, the first time around, simply rebooting right away would have fixed it without any of the other steps.

Tips for a Smooth Roaming Experience:

Research and Preparation: Before traveling, research the available network providers in your destination, their coverage areas, and the agreements your home network provider has with them. This information will help you select the most suitable network option for your needs.

Enable Roaming and Check Settings: Ensure that your phone’s roaming settings are enabled before your departure. Additionally, periodically review and verify your network settings to ensure they are correctly configured for roaming.

Network Selection: Some phones allow you to manually select a preferred network. If you encounter connectivity issues or wish to prioritize a specific network, exploring this option might provide a solution.

Troubleshooting Techniques: In cases of connectivity issues, try simple troubleshooting techniques such as rebooting your device. Restarting can often resolve software-related problems and restore connectivity.

Conclusion:
Cellular roaming offers incredible convenience, allowing us to stay connected when traveling abroad. However, occasional signal challenges and network transitions can disrupt our connectivity. By understanding the factors influencing cellular roaming and employing effective troubleshooting techniques like rebooting, we can overcome these obstacles and enjoy a seamless roaming experience. So, stay informed, be prepared, and roam the world with confidence!

Troubleshooting a Mysterious Networking Issue in Windows 11 (NOT!)

Networking issues can be frustrating and time-consuming to troubleshoot. This was just one of my many experiences troubleshooting an interesting network issue that took me a while to solve.

The Problem: One day, I noticed that my computer’s network connection was acting up. The network interface card (NIC) was sending packets just fine, but it was receiving very few packets, and eventually, it would stop receiving packets altogether. At first, I suspected that the issue happened after I installed the Insider Preview of Windows 11, so I reset Windows. I updated the Realtek NIC driver to the latest version, hoping that it might help. The problem persisted.

The Troubleshooting: Next, I decided to reinstall Windows 11 from scratch, thinking that it might fix the issue. The problem still persisted even after the fresh install. Now I knew that the issue was likely to be hardware.

I booted into Linux from a USB drive. To my surprise, the issue persisted even in Linux. This ruled out any software or driver issues with Windows.

The Solution: I started to suspect that the issue might be with my Wi-Fi access point. I have a TP-Link Deco 6E mesh Wi-Fi system, and one of the access points acts as the main router. I decided to swap the problematic access point with another one, and to my relief, the issue disappeared instantly. My NIC was now sending and receiving packets normally, and I was back online.

Conclusion: Networking issues can be tricky to troubleshoot, and it’s easy to get lost in a sea of software and driver issues. Sometimes, the problem might not even be with your computer at all, but with your network equipment. If you’re experiencing a similar networking issue, try ruling out all software and driver issues first, and then focus on your network equipment. Hopefully, my experience will save you some time and frustration.

Watch out for AppArmor!

I’ve been hit by AppArmor a couple of times now. First with Samba, then with OpenLDAP. AppArmor is a mandatory access control (MAC) system that restricts the capabilities of applications on a Linux system. While it can enhance the security of a Linux system, it can also cause issues with certain applications. Here are some applications that AppArmor can break, with workarounds for each.
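
Before changing any profiles, it’s worth confirming that AppArmor is actually the culprit. A rough way to check, using the standard apparmor-utils tools, is to look for DENIED messages in the kernel log and, if needed, put the suspect profile into complain (log-only) mode while you test:

# See which profiles are loaded and in which mode
sudo aa-status

# Look for recent AppArmor denials
sudo dmesg | grep -i "apparmor.*denied"

# Temporarily switch a profile to complain mode (smbd shown as an example)
sudo aa-complain /etc/apparmor.d/usr.sbin.smbd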

  1. Docker

Docker is a popular containerization technology that allows users to package and run applications in isolated environments. AppArmor can cause issues with Docker by blocking access to certain system resources required by Docker containers. To work around this issue, you can create a custom AppArmor profile for Docker that allows it to access the necessary resources.

To create a custom AppArmor profile for Docker, you can create a new profile file in the /etc/apparmor.d/ directory with the following contents:

# Profile for Docker
profile docker-container {
  # Allow access to necessary system resources
  /var/lib/docker/** rw,
  /var/run/docker.sock rw,
  /sys/fs/cgroup/** rw,
  /proc/sys/** rw,
  /etc/hostname r,
  /etc/hosts r,
  /etc/resolv.conf r,
  /etc/passwd r,
  /etc/group r,
  /etc/shadow r,
  /etc/gshadow r,
}

After creating the profile file, you can load it into the AppArmor kernel by running the following command:

sudo apparmor_parser -r /etc/apparmor.d/docker-container
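
Because the profile above is defined by name rather than attached to a binary, a container has to opt into it explicitly. With Docker that is done through the --security-opt flag, roughly like this:

docker run --security-opt apparmor=docker-container -it ubuntu bash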

  2. Apache

Apache is a widely used web server that can also be affected by AppArmor. If Apache is running in a restricted environment, it may not be able to access certain files or directories. To resolve this issue, you can modify the AppArmor profile for Apache to allow access to the necessary resources.

To modify the AppArmor profile for Apache, you can edit the existing profile file located in /etc/apparmor.d/usr.sbin.apache2 and add the necessary permissions. For example, to allow Apache to access the /var/www/html/ directory, you can add the following line to the profile:

/var/www/html/** r,

After making the necessary changes, you can reload the AppArmor profile by running the following command:

sudo service apparmor reload

  3. MySQL

MySQL is a popular open-source relational database management system that can be affected by AppArmor. If AppArmor is blocking access to MySQL, you may experience issues with database connectivity. To work around this issue, you can modify the AppArmor profile for MySQL to allow access to the necessary resources.

To modify the AppArmor profile for MySQL, you can edit the existing profile file located in /etc/apparmor.d/usr.sbin.mysqld and add the necessary permissions. For example, to allow MySQL to access the /var/lib/mysql/ directory, you can add the following line to the profile:

/var/lib/mysql/** rwk,

After making the necessary changes, you can reload the AppArmor profile by running the following command:

sudo service apparmor reload

  4. Nginx

Nginx is a high-performance web server that can also be affected by AppArmor. If Nginx is running in a restricted environment, it may not be able to access certain files or directories required for its operation. To resolve this issue, you can modify the AppArmor profile for Nginx to allow access to the necessary resources.

To modify the AppArmor profile for Nginx, you can edit the existing profile file located in /etc/apparmor.d/usr.sbin.nginx and add the necessary permissions. For example, to allow Nginx to access the /var/www/html/ directory, you can add the following line to the profile:

/var/www/html/** r,

After making the necessary changes, you can reload the AppArmor profile by running the following command:

sudo service apparmor reload

  5. OpenSSH

OpenSSH is a widely used remote access tool that can also be affected by AppArmor. If AppArmor is blocking access to OpenSSH, you may not be able to establish a remote connection to your Linux system. To work around this issue, you can modify the AppArmor profile for OpenSSH to allow access to the necessary resources.

To modify the AppArmor profile for OpenSSH, you can edit the existing profile file located in /etc/apparmor.d/usr.sbin.sshd and add the necessary permissions. For example, to allow OpenSSH to access the /var/log/auth.log file, you can add the following line to the profile:

/var/log/auth.log rw,

After making the necessary changes, you can reload the AppArmor profile by running the following command:

sudo service apparmor reload

  6. Samba

Samba was the first application that bit me. AppArmor can block smbd from reading or writing a share that lives outside the paths its profile allows.

To modify the AppArmor profile for Samba, you can edit the existing profile file located in /etc/apparmor.d/usr.sbin.smbd and add the necessary permissions. For example, to allow Samba to access the /mnt/share/ directory, you can add the following line to the profile:

/mnt/share/** rw,

After making the necessary changes, you can reload the AppArmor profile by running the following command:

sudo service apparmor reload

  7. OpenLDAP

OpenLDAP was the second one that bit me. The slapd daemon can run into AppArmor when its databases, configuration, or certificates live in non-default locations.

To modify the AppArmor profile for OpenLDAP, you can create a new profile file in the /etc/apparmor.d/ directory with the following contents:

# Profile for OpenLDAP
profile slapd /usr/sbin/slapd {
  # Allow access to necessary system resources
  /var/lib/ldap/ r,
  /var/lib/ldap/** rw,
  /var/run/slapd/** rw,
  /etc/ldap/slapd.conf r,
  /etc/ldap/slapd.d/ r,
  /etc/ldap/slapd.d/** r,
  /usr/sbin/slapd mr,
  /usr/sbin/slapd.debug mr,
  /usr/sbin/slapd-{slave,monitor} ix,
  /usr/sbin/slapd.dbg mr,
  /usr/sbin/slapd-sock rw,
  /usr/sbin/slapd-sock-debug rw,
  /usr/sbin/slaptest mr,
}

After creating the profile file, you can load it into the AppArmor kernel by running the following command:

sudo apparmor_parser -r /etc/apparmor.d/slapd

AppArmor can cause issues with various applications on a Linux system, but these issues can usually be resolved by adjusting the profile for the affected application. By following the steps outlined above, you can give your applications the permissions they need while still keeping the security benefits of AppArmor.

Loading up Active Directory with lots of groups

Loading up Active Directory with lots of groups can be a tedious task, but it can be made easier with a few steps. I recently had to do this to test a product and make sure it could handle a large amount of data. I started with a list of job titles, found that those didn’t produce enough groups, and ended up using a list of animal names as the input to a script that automates creating the groups in Active Directory.

First, let’s assume that you already have Active Directory set up and that you have the necessary permissions to create groups. We will use a small ldif template to create the groups in Active Directory.

Here is the step-by-step process to load up Active Directory with lots of groups:

  1. Prepare the list of groups: In our example, the list of groups is just a list of animal names, one per line, in a file called groups.txt. You can create your own list of groups based on your requirements.
  2. Create an ldif file: Create an ldif template (group.ldif) that contains the group details, using {groupname} as a placeholder for the group name; a minimal example template is shown at the end of this section.
  3. Run a loop: To automate the process of creating the groups, we can use a while loop that reads the list of groups and creates each one in Active Directory from the ldif template. Here’s an example script:
#!/bin/bash

# Read the list of groups from a file
while read -r group; do
  # Replace {groupname} in the ldif template with the actual group name
  # (use > rather than >> so temp.ldif only ever contains the current group)
  sed "s/{groupname}/$group/" group.ldif > temp.ldif
  # Create the group in Active Directory using the ldapadd command
  ldapadd -x -H ldap://<ad-server> -D "CN=Administrator,CN=Users,DC=mydomain,DC=com" -w password -f temp.ldif
done < groups.txt

# Clean up the temporary file
rm -f temp.ldif

In the above script, replace the following:

  • group.ldif with the name of the ldif file that you created in step 2.
  • groups.txt with the name of the file that contains the list of groups.
  • <ad-server> with the hostname or IP address of a domain controller.
  • CN=Administrator,CN=Users,DC=mydomain,DC=com with the actual Distinguished Name (DN) of the user account that you want to use to create the groups.
  • password with the password for the user account.
  4. Run the script: Save the script to a file (e.g., create-groups.sh) and make it executable using the command chmod +x create-groups.sh. Then run the script using the command ./create-groups.sh.

That’s it! The script will create all the groups in the list and add them to Active Directory. You can modify the ldif template and the script as per your requirements to create groups with different attributes and properties.
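
For reference, here is a minimal sketch of what group.ldif could look like. The OU (OU=Groups) and the domain components are assumptions, so adjust them to match your directory:

dn: CN={groupname},OU=Groups,DC=mydomain,DC=com
objectClass: top
objectClass: group
cn: {groupname}
sAMAccountName: {groupname}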

irqbalance or set_irq_affinity – interesting cause for a network performance issue.

When it comes to high-performance computing, squeezing every bit of performance out of the system is crucial. One of the critical factors in achieving high performance is reducing system latency. Interrupt requests (IRQs) are a type of signal generated by hardware devices that require attention from the CPU. By default, IRQs can be delivered to any CPU core in a multi-core system. This can lead to cache misses and contention, ultimately leading to increased latency. Fortunately, there are tools available to help manage IRQ affinity and reduce latency, such as irqbalance and set_irq_affinity. https://github.com/majek/ixgbe/blob/master/scripts/set_irq_affinity

irqbalance is a Linux daemon that helps to balance IRQs across multiple CPU cores to reduce latency. By default, irqbalance distributes IRQs across all CPU cores, which is a good starting point. However, depending on the system configuration, it may be necessary to adjust IRQ affinity further to optimize performance.

set_irq_affinity is a script that allows users to set IRQ affinity for specific hardware devices. The script can be used to specify which CPU cores should receive IRQs for a specific hardware device, reducing the chance of cache misses and contention. Set_irq_affinity requires root access to run and must be executed for each device on the system.

To use set_irq_affinity, first, identify the device’s IRQ number using the “cat /proc/interrupts” command. Once the IRQ number has been identified, run the set_irq_affinity script, specifying the IRQ number and the desired CPU cores. For example, to set the IRQ affinity for IRQ 16 to CPU cores 0 and 1, run the following command:

sudo set_irq_affinity.sh 16 0-1

This command tells the kernel to route IRQ 16 to CPU cores 0 and 1.
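
Under the hood, scripts like set_irq_affinity simply write CPU masks into /proc. If you prefer to do it by hand, or the script doesn't cover your NIC, something like the following should work; the interface name eth0, IRQ number 16, and core list 0-1 are just placeholders matching the example above:

# Find the IRQ number(s) assigned to your NIC
grep eth0 /proc/interrupts

# Pin IRQ 16 to CPU cores 0 and 1
echo 0-1 | sudo tee /proc/irq/16/smp_affinity_list

# Verify the new affinity
cat /proc/irq/16/smp_affinity_list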

Keep in mind that setting IRQ affinity is a delicate balance. Setting IRQ affinity for too few CPU cores can result in increased latency due to increased contention for those cores. On the other hand, setting IRQ affinity for too many CPU cores can result in inefficient cache usage and increased latency due to cache misses.

In summary, managing IRQ affinity is an important aspect of optimizing system performance, particularly in high-performance computing environments. The irqbalance daemon can help to balance IRQs across multiple CPU cores, while set_irq_affinity allows users to specify the IRQ affinity for specific hardware devices. By carefully managing IRQ affinity, users can reduce latency and achieve better system performance.

Clean up your old Kubernetes persistent data!

If you have ever removed a node from a Kubernetes cluster and then added it back, you may have encountered some issues with persistent data. Persistent data is any data that survives beyond the lifecycle of a pod, such as databases, logs, or configuration files. Kubernetes uses persistent volumes (PVs) and persistent volume claims (PVCs) to manage persistent data across the cluster.

However, sometimes these resources may not be cleaned up properly when a node is deleted or drained. This can cause problems when you try to reuse the node for another cluster or add it back to the same cluster. For example, you may see errors like:

  • Failed to attach volume “pvc-1234” on node “node1”: volume is already attached to node “node2”
  • Failed to mount volume “pvc-5678” on pod “pod1”: mount failed: exit status 32
  • Failed to create subPath directory for volumeMount “data” of container “db”: mkdir /var/lib/kubelet/pods/abcd-efgh/volumes/kubernetes.io~nfs/data: file exists

To avoid these issues, you need to clean up your old Kubernetes persistent data before adding a node back to a cluster. Here are some steps you can follow:

Step 1: Delete or unbind any PVCs associated with the node

The first step is to delete or unbind any PVCs that are associated with the node you want to remove. A PVC is a request for storage by a user or a pod. It binds to a PV that provides the actual storage backend. When you delete a PVC, it also releases the PV that it was bound to, unless the PV has a reclaim policy of Retain.

To list all the PVCs in your cluster, you can use the command:

kubectl get pvc --all-namespaces

To delete a PVC, you can use the command:

kubectl delete pvc <pvc-name> -n <namespace>

Alternatively, you can unbind a PVC from a PV without deleting it by editing the PVC spec and removing the volumeName field. This will make the PVC available for binding to another PV.

To edit a PVC, you can use the command:

kubectl edit pvc <pvc-name> -n <namespace>

Step 2: Delete any PVs that are not bound to any PVCs

The next step is to delete any PVs that are not bound to any PVCs. A PV is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using storage classes. It is a resource in the cluster just like a node. PVs have a lifecycle independent of any pod that uses them.

To list all the PVs in your cluster, you can use the command:

kubectl get pv
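
PVs whose PVC has already been deleted usually show a status of Released rather than Bound. A quick way to pick them out, assuming you have jq installed, is something like:

kubectl get pv -o json | jq -r '.items[] | select(.status.phase == "Released") | .metadata.name'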

To delete a PV, you can use the command:

kubectl delete pv <pv-name>

Note that deleting a PV does not necessarily delete the underlying storage device or volume. Depending on the type of storage and the reclaim policy of the PV, you may need to manually delete the storage device or volume from your cloud provider or storage server.

Step 3: Delete any leftover data on the node

The final step is to delete any leftover data on the node that you want to remove. This may include directories or files that were created by Kubernetes or by your applications. For example, you may need to delete:

  • The /etc/cni/net.d directory that contains CNI (Container Network Interface) configuration files
  • The /var/lib/kubelet directory that contains kubelet data such as pods, volumes, plugins, etc.
  • The /var/lib/etcd directory that contains etcd data if the node was running an etcd member
  • The /var/lib/docker directory that contains docker data such as images, containers, volumes, etc.
  • Any other application-specific data directories or files that were mounted or created on the node

To delete these directories or files, you can use commands like:

sudo rm -rf /etc/cni/net.d
sudo rm -rf /var/lib/kubelet
sudo rm -rf /var/lib/etcd
sudo rm -rf /var/lib/docker
sudo rm -rf /path/to/your/application/data

Be careful when using these commands and make sure you are deleting only what you intend to delete.


Use a password manager!

In today’s digital age, where we have an online presence for almost everything, from social media to banking, it’s essential to keep our personal information secure. One of the most crucial aspects of online security is using strong and unique passwords for every website. However, with the growing number of online accounts, it can be challenging to remember all the passwords. That’s where password managers come in.

A password manager is an application that stores your passwords securely in an encrypted database. It creates and stores unique, strong passwords for every website you use, so you don’t have to remember them. Instead, you only need to remember one master password to access your password manager.

Using a password manager offers many benefits. Firstly, it eliminates the need to remember multiple passwords, which can be a daunting task, especially when you’re using complex passwords. Secondly, it saves you time since you don’t have to waste time resetting passwords or trying to remember them. Thirdly, it helps protect against phishing attacks, as the password manager only fills in passwords for legitimate websites. Finally, it provides an additional layer of security, as password managers generate random, complex passwords that are much harder to guess or crack.

While using a password manager is undoubtedly beneficial, it’s important to remember that it’s not a silver bullet for online security. It’s crucial to choose a strong and unique master password, preferably a passphrase that’s easy to remember but difficult for others to guess. You should also enable two-factor authentication, which requires you to enter a code sent to your phone or another device to access your account.

Another important aspect of online security is to never write down passwords or store them in unencrypted files. Writing down passwords and leaving them in plain sight can make it easy for someone to gain access to your accounts. If you must write down a password, store it in a secure location like a locked safe.

Finally, it’s important to use a different password for every website. This may seem like a hassle, but it’s crucial for security. If you use the same password for multiple accounts and a hacker gains access to one, they can easily access all your accounts. By using unique passwords for every website, you limit the damage that a data breach can cause.

Using a password manager is an excellent way to stay secure online. It eliminates the need to remember multiple passwords, saves time, and provides an extra layer of security. However, it’s important to use a strong and unique master password, enable two-factor authentication, and avoid writing down passwords. By taking these precautions, you can help protect yourself from the increasing number of online threats.

There are several popular password managers available, each with its own unique features and capabilities. Here are some examples of popular password managers:

  1. LastPass: LastPass is a popular password manager that offers both free and paid versions. It can generate strong, unique passwords and store them securely, as well as autofill login credentials on websites and applications.
  2. 1Password: 1Password is another popular password manager that offers features like password generation, secure storage, and autofill. It also includes a digital wallet for storing credit card information and secure notes.
  3. Dashlane: Dashlane is a user-friendly password manager that offers both free and paid versions. It can generate and store strong passwords, autofill login credentials, and provide secure sharing of passwords with trusted family and friends.
  4. KeePass: KeePass is a free, open-source password manager that allows you to store passwords in an encrypted database. It has plugins available for additional features and supports two-factor authentication.
  5. Bitwarden: Bitwarden is a free, open-source password manager that offers both desktop and mobile applications. It can generate strong passwords, store them securely, and autofill login credentials on websites and applications.
  6. MacPass: MacPass is a free, open-source password manager that is specifically designed for macOS. It stores passwords in an encrypted database and supports two-factor authentication.
  7. KeePassXC: KeePassXC is a community-driven, open-source password manager that is compatible with multiple platforms, including Windows, macOS, and Linux. It offers features like password generation, secure storage, and autofill.

There are many password managers available, each with its own unique features and benefits. It’s essential to choose a password manager that meets your specific needs and preferences to help keep your online accounts secure.

Help! OpenLDAP won’t start

slapd[5472]: main: TLS init def ctx failed: -1

I borrowed some information from here: https://apple.stackexchange.com/questions/107130/slapd-daemon-cant-start-tls-init-def-ctx-failed-1. Basically, just run slapd -d1 and see where the certificate is having trouble.
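
On CentOS 7 that looks roughly like the following; the listener URLs and the ldap user are the distribution defaults, so adjust if yours differ. Watch the output for TLS lines, which typically name the certificate or key file slapd cannot read:

sudo slapd -u ldap -h "ldapi:/// ldap:/// ldaps:///" -d 1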

Crazily, before I bothered to check that, I just wiped my entire ldap server and rebuilt it. What’s even crazier is that after reinstalling, it never started either! Using CentOS 7, I removed the openldap-servers package and deleted the /var/lib/ldap and /etc/openldap directories. Installing the rpms recreated those directories, but did not rebuild the self-signed certificates in /etc/openldap/certs. I ended up finding this: 0006945: CentOS 6.5: /etc/openldap/certs/* missing – CentOS Bug Tracker. I guess there’s a post-install script that should run when the openldap-servers package gets installed, /usr/libexec/openldap/create-certdb.sh. Running it did create some certificates, but those didn’t allow the ldap server to start either.

Finally, I disabled SSL to fix it. These were the steps.

  1. Edit the /etc/openldap/slapd.d/cn=config.ldif file. Remove anything that starts with olcTLS. There should be only a couple of such lines.
  2. Then stop the server from starting in TLS. You may or may not need to do this. In /etc/sysconfig/slapd, if you have ldaps:///, you can remove that part so that the server won’t start in TLS.
  3. Finally, when you’re done with that, the LDAP server will start.

If you want to re-enable the TLS, you can follow these instructions to do it. Configure OpenLDAP over SSL/TLS [Step-by-Step] Rocky Linux 8 | GoLinuxCloud

You can also potentially run into this problem with SELinux or AppArmor. With Ubuntu and AppArmor, here’s how to get around it. https://askubuntu.com/questions/499164/slapd-tls-not-working-apparmor-errors

Hope this helps you!