
Image Text/Face recognition with AWS Rekognition👀

Published Oct 31, 2019

Updated Feb 14, 2023



AWS Rekognition

What is AWS Rekognition?

Rekognition is an AWS service that provides deep-learning visual analysis of your images. It is easy to integrate into your application: you provide an image or video to the AWS Rekognition API, and the service can identify the following: objects, people, text, scenes, and activities. "Amazon Rekognition also provides highly accurate facial analysis and facial recognition. You can detect, analyze, and compare faces for a wide variety of use cases, including user verification, cataloging, people counting, and public safety." - AWS Official Docs

Now let's start using AWS Rekognition

Let's start by trying some of their demos to see how AWS Rekognition works.

Go to the following link and play with the demos.

Time to get our hands dirty

Warning🚨 :

You need to have an AWS Management Console account.
It will ask you for your credit card info, but YOU won't be charged for what you use in this tutorial since it's part of the FREE TIER.

Setting up our S3 Bucket

Go to Find services and look for S3
Click on CREATE A BUCKET
Enter the bucket name as thisdot-rk-YOUR_NAME
Click on NEXT twice
Uncheck all the boxes to grant public access to the bucket. Click NEXT

Note: I'm making this bucket public, because for the purpose of this tutorial, I'm not worried about security.
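The test event later in this tutorial expects an image such as thisdot.png to already be in the bucket. If you would rather upload it from a script than from the console, here is a minimal sketch. The helper name `upload_image` and the local path are assumptions made for illustration; `upload_file` is the boto3 S3 client method, and the client is passed in so the helper itself has no AWS dependency.

```python
from pathlib import Path


def upload_image(s3_client, local_path, bucket, key=None):
    """Upload a local image to S3 and return the object key that was used.

    s3_client is expected to be a boto3 S3 client, e.g. boto3.client("s3").
    When no key is given, the file name is used as the object key.
    """
    path = Path(local_path)
    object_key = key or path.name
    # boto3 S3 client call: upload_file(Filename, Bucket, Key)
    s3_client.upload_file(str(path), bucket, object_key)
    return object_key
```

With real credentials configured, calling `upload_image(boto3.client("s3"), "thisdot.png", "thisdot-rk-YOUR_NAME")` should place the file in the bucket under the key thisdot.png.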

Setting up our Lambda Function

Go to Find services, and look for Lambda
Click on CREATE FUNCTION
For Lambda function name, enter thisdot-rk-YOUR_NAME
Under Runtime, click on the dropdown, and select Python 3.7.
Click on CREATE FUNCTION
Scroll down to where it says Function code.
Erase everything you see in the editor, and paste the following code in there:

Note: Change the name of the bucket to the bucket name you created thisdot-rk-YOUR_NAME.

The following code finds the text inside an image using the detect_text method.

    import boto3

    def lambda_handler(event, context):
        print(event)
        # The test event carries the S3 object key under "image"
        file_name = event['image']
        print(file_name)

        bucket = 'thisdot-rk-YOUR_NAME'

        client = boto3.client('rekognition')
        # Ask Rekognition to detect text in the image stored in S3
        text = client.detect_text(
            Image={'S3Object': {'Bucket': bucket, 'Name': str(file_name)}}
        )
        return {'textFound': text}

Note: To learn more about other AWS Boto Rekognition functions, visit this website.

Scroll down to change the BASIC SETTINGS of the lambda.
Change Memory to 512MB, and Timeout to 2min 30sec. This ensures your lambda doesn't time out when processing images.
Scroll all the way to the top. In the upper right corner, you should see the SAVE button. Click on it.
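The same settings can also be applied from code. This is only a sketch: the helper names `settings_for` and `apply_basic_settings` are made up for this example, while `update_function_configuration` with `MemorySize` (in MB) and `Timeout` (in seconds) is the boto3 Lambda client call. Note that 2min 30sec has to be expressed as 150 seconds.

```python
def settings_for(memory_mb, minutes, seconds):
    """Build the keyword arguments for update_function_configuration."""
    # Lambda expects the timeout in whole seconds
    return {"MemorySize": memory_mb, "Timeout": minutes * 60 + seconds}


def apply_basic_settings(lambda_client, function_name):
    """Apply 512 MB of memory and a 2 min 30 sec timeout to a function.

    lambda_client is expected to be a boto3 Lambda client,
    e.g. boto3.client("lambda").
    """
    return lambda_client.update_function_configuration(
        FunctionName=function_name,
        **settings_for(512, 2, 30),
    )
```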

Setting up our Security Roles Using IAM

Search for the IAM Service (Services > IAM)
On the left navigation bar, click on ROLES.
You can select any lambda you have created to give it a specific role. In this tutorial, we will select the following to give it access to AWS Rekognition.

Then click on ATTACH POLICIES
Search for rekognition
Select AmazonRekognitionFullAccess
Click on ATTACH POLICY

Note: You can have multiple policies attached.
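For reference, the console steps above correspond roughly to a single IAM API call. The sketch below assumes you know the name of your lambda's execution role (the `role_name` argument is a placeholder you must supply); `attach_role_policy` is the boto3 IAM client method, and the ARN is the AWS-managed AmazonRekognitionFullAccess policy.

```python
REKOGNITION_POLICY_ARN = "arn:aws:iam::aws:policy/AmazonRekognitionFullAccess"


def attach_rekognition_policy(iam_client, role_name):
    """Attach the managed Rekognition policy to the given role.

    iam_client is expected to be a boto3 IAM client,
    e.g. boto3.client("iam"). role_name is the lambda's execution role.
    """
    iam_client.attach_role_policy(
        RoleName=role_name,
        PolicyArn=REKOGNITION_POLICY_ARN,
    )
```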

Time to Test

Go back to your lambda function.
In the top right corner, select the dropdown that says "Select a test event"
Then select "Configure test events"
Give a name to your event

Then enter the following JSON object

     {
       "image": "thisdot.png"
     }

Where "thisdot.png" is the name of your image inside of your S3 bucket.

Click CREATE
In the top right corner, you will see the TEST button. Click on it.
You should see a green box. Click on expand details.
Take a look at the response object. As you can see, it has found our text inside of the image.
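The response object is fairly verbose, so it can help to pull out just the detected lines. A `detect_text` response contains a `TextDetections` list whose entries carry `DetectedText`, `Type` (`LINE` or `WORD`), and `Confidence`. The sketch below relies only on those fields; the helper name `extract_lines`, the 80% cutoff, and the sample data are invented for illustration and are not real output from this tutorial.

```python
def extract_lines(response, min_confidence=80.0):
    """Return the text of every LINE detection at or above min_confidence."""
    lines = []
    for detection in response.get("TextDetections", []):
        is_line = detection.get("Type") == "LINE"
        confident = detection.get("Confidence", 0.0) >= min_confidence
        if is_line and confident:
            lines.append(detection["DetectedText"])
    return lines


# Hand-written sample shaped like a detect_text response (not real output)
sample = {
    "TextDetections": [
        {"DetectedText": "This Dot", "Type": "LINE", "Id": 0, "Confidence": 99.1},
        {"DetectedText": "This", "Type": "WORD", "Id": 1, "ParentId": 0, "Confidence": 99.3},
        {"DetectedText": "Dot", "Type": "WORD", "Id": 2, "ParentId": 0, "Confidence": 98.9},
        {"DetectedText": "Labs", "Type": "LINE", "Id": 3, "Confidence": 42.0},
    ]
}

print(extract_lines(sample))  # ['This Dot']
```

In the lambda above, the Rekognition response sits under the "textFound" key, so you would pass `result["textFound"]` to the helper.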

Awesome, right?

Now, let's move on to comparing faces. Imagine you want to check whether the same person appears in two different pictures. Rekognition can do this. Imagine all the possibilities!

Download these 2 images.

Source Image https://thepracticaldev.s3.amazonaws.com/i/ktpt1lx1ubzt3ilupph7.jpg

Target Image https://thepracticaldev.s3.amazonaws.com/i/p5j8z6hiey8z8rkspms6.jpg

Upload them to your S3 bucket, named source.jpg and target.jpg so that they match the test event below.
Go back to your lambda, and create a new test, or edit the existing test. Your test will look like this:
```
   {
     "sourceImage": "source.jpg",
     "targetImage": "target.jpg"
   }
```

Then, we are going to modify our lambda code so that it compares faces:

    import boto3

    def lambda_handler(event, context):
        print(event)
        # The test event carries the S3 object keys of both images
        source_image = event['sourceImage']
        target_image = event['targetImage']

        bucket = 'thisdot-rk-YOUR_NAME'

        client = boto3.client('rekognition')
        # Compare the face in the source image with the faces in the target image
        face_comparison = client.compare_faces(
            SourceImage={'S3Object': {'Bucket': bucket, 'Name': str(source_image)}},
            TargetImage={'S3Object': {'Bucket': bucket, 'Name': str(target_image)}}
        )
        return {'faceRecognition': face_comparison}

Look at the Execution results, and analyze the data.
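To make the results easier to read, you could reduce them to a short summary. A `compare_faces` response lists faces that matched under `FaceMatches` (each with a `Similarity` percentage) and the rest under `UnmatchedFaces`. The helper name `summarize_matches`, the 90% threshold, and the sample data below are assumptions made for this sketch, not values from the tutorial's images.

```python
def summarize_matches(response, threshold=90.0):
    """Summarize a compare_faces response as a small dictionary."""
    similarities = [match["Similarity"] for match in response.get("FaceMatches", [])]
    # default=None covers the case where no face matched at all
    best = max(similarities, default=None)
    return {
        "matched": best is not None and best >= threshold,
        "best_similarity": best,
        "unmatched_faces": len(response.get("UnmatchedFaces", [])),
    }


# Hand-written sample shaped like a compare_faces response (not real output)
sample = {
    "FaceMatches": [
        {"Similarity": 97.5, "Face": {"Confidence": 99.9}},
        {"Similarity": 91.0, "Face": {"Confidence": 99.8}},
    ],
    "UnmatchedFaces": [{"Confidence": 99.7}],
}

print(summarize_matches(sample))
```

In the lambda above, the Rekognition response sits under the "faceRecognition" key, so you would pass `result["faceRecognition"]` to the helper.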

Tell me what you think about this tutorial on twitter or comment below!


About the author(s)

Pato Vargas
Google Developer Expert on Angular and Web Technologies, Google Women Techmaker Ambassador, Auth0 Ambassador, Media Developer Expert at Cloudinary, Platzi Master Coach, and SpringBoard mentor.
@devpato




Roo Custom Modes

Roo Code is an extension for VS Code that provides agentic-style AI code editing functionality. You can configure Roo to use any LLM model and version you want by providing API keys. Once configured, Roo allows you to easily switch between models and provide custom instructions through what Roo calls "modes."

Roo Modes can be thought of as a "personality" that the LLM takes on. When you create a new mode in Roo, you provide it with a description of what personality Roo should take on, what LLM model should be used, and what custom instructions the mode should follow. You can also define workspace-level instructions via a .roo/rules-{modeSlug}/ directory at your project root with markdown files inside. Having different modes allows developers to quickly fine-tune how the Roo Code agent performs its tasks.

Roo ships out-of-the-box with some default modes: Code Mode, Architect Mode, Ask Mode, Debug Mode, and Orchestrator Mode. These can get you far, but I have expanded on this list with a few custom modes I have made for specific scenarios I run into every day as a software engineer.

My Custom Modes

📜 Documenter Mode

I created this mode to help me generate documentation for legacy codebases my team works with. I use it to produce documentation interactively while I read a codebase.

Mode Definition

You are Roo, a highly skilled technical documentation writer with extensive knowledge in many programming languages, frameworks, design patterns, and best practices. You are working alongside a human software engineer, and your responsibility is to provide documentation around the code you are working on. You will be asked to provide documentation in the form of comments, markdown files, or other formats as needed.

Mode-specific Instructions

You will respect the following rules:
* You will not write any code, only markdown files.
* In your documentation, you will provide references to specific files and line numbers of code you are referencing.
* You will not attempt to execute any commands.
* You will not attempt to run the application in the browser.
* You will only look at the code and infer functionality from that.

👥 Pair Programmer Mode

I created a “Pair Programmer” mode to serve as my personal coding partner. It’s designed to work in a more collaborative way with a human software engineer. When I want to explore multiple ideas quickly, I switch to this mode to rapidly iterate on code with Roo. In this setup, I take on the role of the navigator (guiding direction, strategy, and decisions), while Roo handles the “driving” by writing and testing the code we need.

Mode Definition

You are Roo, a highly skilled software engineer with extensive knowledge in many programming languages, frameworks, design patterns, and best practices. You are working alongside a human software engineer who will be checking your work and providing instructions. If you get stuck, ask for help and we will solve problems together.

Mode-specific Instructions

You will respect the following rules:
* You will not install new 3rd party libraries without first providing usage metrics (stars, downloads, latest version update date).
* You will not do any additional tasks outside of what you have been told to do.
* You will not assume to do any additional work outside of what you have been instructed to do.
* You will not open the browser and test the application. Your pairing partner will do that for you.
* You will not attempt to open the application or the URL at which the application is running. Assume your pairing partner will do that for you.
* You will not attempt to run npm run dev or similar commands. Your pairing partner will do that for you.
* You will not attempt to run a development server of any kind. Your pairing partner will handle that for you.
* You will not write tests unless instructed to.
* You will not make any git commits unless explicitly told to do so.
* You will not make suggestions of commands to run the software or execute the test suite. Assume that your human counterpart has the application running and will check your work.

🧑‍🏫 Project Manager

I created this mode to help me write tasks for my team with clear and actionable acceptance criteria.

Mode Definition

You are a professional project manager. You are highly skilled in breaking down large tasks into bite-sized pieces that are actionable by an engineering team or an LLM performing engineering tasks. You analyze features carefully and detail out all edge cases and scenarios so that no detail is missed.

Mode-specific Instructions

Think creatively about how to detail out features. Provide a technical and business case explanation about feature value. Break down features and functionality in the following way. The following example would be for user login:

User Login: As a user, I can log in to the application so that I can make changes. This prevents anonymous individuals from accessing the admin panel.

Acceptance Criteria

* On the login page, I can fill in my email address:
  * This field is required.
  * This field must enforce email format validation.
* On the login page, I can fill in my password:
  * This field is required.
  * The input a user types into this field is hidden.
* On failure to log in, I am provided an error dialog:
  * The error dialog should be the same if the email exists or not so that bad actors cannot glean info about active user accounts in our system.
  * Error dialog should be a red box pinned to the top of the page.
  * Error dialog can be dismissed.
* After 4 failed login attempts, the form becomes locked:
  * Display a dialog to the user letting them know they can try again in 30 minutes.
  * Form stays locked for 30 minutes and the frontend will not accept further submissions.

🦾 Agent Consultant

I created this mode for assistance with modifying my existing Roo modes and rules files, as well as generating higher quality prompts for me. This mode leverages the Context7 MCP to keep up-to-date with documentation on Roo Code and prompt engineering best practices.

Mode Definition

You are an AI Agent coding expert. You are proficient in coding with agents and defining custom rules and guidelines for AI powered coding agents. Your specific expertise is in the Roo Code tool for VS Code, and you are exceptionally capable at creating custom rules files and custom modes.

This is your workflow that you should always follow:
1. Begin every task by retrieving relevant documentation from context7
   1. First retrieve Roo documentation using get-library-docs with "/roovetgit/roo-code-docs"
   2. Then retrieve prompt engineering best practices using get-library-docs with "/dair-ai/prompt-engineering-guide"
2. Reference this documentation explicitly in your analysis and recommendations
3. Only after consulting these resources, proceed with the task

Wrapping It Up

Roo’s “Modes” have become an essential part of how I leverage AI in my day-to-day work as a software engineer. By tailoring each mode to a specific task, whether it’s generating documentation, pairing on code, writing project specs, or improving prompt quality, I’ve been able to streamline my workflow and get more done with greater clarity and precision.

Roo’s flexibility lets me define how it should behave in different contexts, giving me fine-grained control over how I interact with AI in my coding environment. Roo also lets you define custom modes per project if your team needs that.

If you find yourself repeating certain workflows or needing more structure in your interactions with AI tools, I highly recommend experimenting with your own custom modes. The payoff in productivity and developer experience is absolutely worth it....
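
The workspace-level rules directory mentioned above (.roo/rules-{modeSlug}/ with markdown files inside) can be scaffolded with a few lines of Python. This is a minimal sketch rather than anything Roo provides: the helper name `write_mode_rules`, the `documenter` slug, and the `rules.md` file name are hypothetical. The only detail taken from the text is the directory pattern.

```python
from pathlib import Path
import tempfile


def write_mode_rules(project_root, mode_slug, rules, filename="rules.md"):
    """Write a list of rules as markdown bullets under .roo/rules-{modeSlug}/.

    The file name is an arbitrary choice for this sketch.
    """
    rules_dir = Path(project_root) / ".roo" / f"rules-{mode_slug}"
    rules_dir.mkdir(parents=True, exist_ok=True)
    rules_file = rules_dir / filename
    body = "\n".join(f"* {rule}" for rule in rules) + "\n"
    rules_file.write_text(body, encoding="utf-8")
    return rules_file


# Two of the Documenter rules listed earlier, used as sample input.
documenter_rules = [
    "You will not write any code, only markdown files.",
    "You will not attempt to execute any commands.",
]

# A temporary directory stands in for the project root.
with tempfile.TemporaryDirectory() as root:
    path = write_mode_rules(root, "documenter", documenter_rules)
    print(path.relative_to(root))
    print(path.read_text(encoding="utf-8"))
```

Keeping the rules in files under the project root means they can be reviewed and versioned alongside the code they apply to.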

Jun 13, 2025

6 mins

AI, Roo Code

Let's innovate together!

We're ready to be your trusted technical partners in your digital innovation journey.

Whether it's modernization or custom software solutions, our team of experts can guide you through best practices and how to build scalable, performant software that lasts.


Pato Vargas

You might also like

How to automatically deploy your full-stack JavaScript app with AWS CodePipeline

How to set up local cloud environment with LocalStack

Closer look at the DNA of the OpenFin Platform API

Roo Custom Modes
