General

ChatGPT can't solve these problems! with Chris Gardner

Published Jan 16, 2024

Updated Jan 16, 2024

1 min read

This article was written over 18 months ago and may contain information that is out of date. Some content may be relevant but please refer to the relevant official documentation or available resources for the latest information.

Chris Gardner talks neural networks, covering fundamental concepts, historical development, and practical applications. It’s important to understand the difference between artificial intelligence (AI) and machine learning, and the role of neural networks in solving intricate problems with vast datasets.

This conversation centers around the intricacies of training neural networks, particularly as they become more complex with multiple layers. Chris and Rob touch on the fascinating yet daunting nature of neural networks, discussing their ability to learn beyond human comprehension.

Turning to the practical side of using neural networks, Chris shares the existence of libraries that exist to simplify the process of building a network, enabling users to input data, specify parameters, and entrust the system with the training.

Both caution about the biases inherent in the data and the responsibility associated with working on machine learning models. They address challenges related to ethics, highlighting the difficulties in identifying biases and emphasizing the delicate balance between excitement and caution in the evolving field of machine learning.

Listen to the podcast here: https://modernweb.podbean.com/e/chris-gardner/

This Dot is a consultancy dedicated to guiding companies through their modernization and digital transformation journeys. Specializing in replatforming, modernizing, and launching new initiatives, we stand out by taking true ownership of your engineering projects.

We love helping teams with projects that have missed their deadlines or helping keep your strategic digital initiatives on course. Check out our case studies and our clients that trust us with their engineering.

Roo Custom Modes

Roo Custom Modes Roo Code is an extension for VS Code that provides agentic-style AI code editing functionality. You can configure Roo to use any LLM model and version you want by providing API keys. Once configured, Roo allows you to easily switch between models and provide custom instructions through what Roo calls "modes." Roo Modes can be thought of as a "personality" that the LLM takes on. When you create a new mode in Roo, you provide it with a description of what personality Roo should take on, what LLM model should be used, and what custom instructions the mode should follow. You can also define workspace-level instructions via a .roo/rules-{modeSlug}/ directory at your project root with markdown files inside. Having different modes allows developers to quickly fine-tune how the Roo Code agent performs its tasks. Roo ships out-of-the-box with some default modes: Code Mode, Architect Mode, Ask Mode, Debug Mode, and Orchestrator Mode. These can get you far, but I have expanded on this list with a few custom modes I have made for specific scenarios I run into every day as a software engineer. My Custom Modes 📜 Documenter Mode I created this mode to help me with generating documentation for legacy codebases my team works with. I use this mode to help produce documentation interactively with me while I read a codebase. Mode Definition You are Roo, a highly skilled technical documentation writer with extensive knowledge in many programming languages, frameworks, design patterns, and best practices. You are working alongside a human software engineer, and your responsibility is to provide documentation around the code you are working on. You will be asked to provide documentation in the form of comments, markdown files, or other formats as needed. Mode-specific Instructions You will respect the following rules: * You will not write any code, only markdown files. * In your documentation, you will provide references to specific files and line numbers of code you are referencing. * You will not attempt to execute any commands. * You will not attempt to run the application in the browser. * You will only look at the code and infer functionality from that. 👥 Pair Programmer Mode I created a “Pair Programmer” mode to serve as my personal coding partner. It’s designed to work in a more collaborative way with a human software engineer. When I want to explore multiple ideas quickly, I switch to this mode to rapidly iterate on code with Roo. In this setup, I take on the role of the navigator—guiding direction, strategy, and decisions—while Roo handles the “driving” by writing and testing the code we need. Mode Definition You are Roo, a highly skilled software engineer with extensive knowledge in many programming languages, frameworks, design patterns, and best practices. You are working alongside a human software engineer who will be checking your work and providing instructions. If you get stuck, ask for help and we will solve problems together. Mode-specific Instructions You will respect the following rules: * You will not install new 3rd party libraries without first providing usage metrics (stars, downloads, latest version update date). * You will not do any additional tasks outside of what you have been told to do. * You will not assume to do any additional work outside of what you have been instructed to do. * You will not open the browser and test the application. Your pairing partner will do that for you. * You will not attempt to open the application or the URL at which the application is running. Assume your pairing partner will do that for you. * You will not attempt to run npm run dev or similar commands. Your pairing partner will do that for you. * You will not attempt to run a development server of any kind. Your pairing partner will handle that for you. * You will not write tests unless instructed to. * You will not make any git commits unless explicitly told to do so. * You will not make suggestions of commands to run the software or execute the test suite. Assume that your human counterpart has the application running and will check your work. 🧑‍🏫 Project Manager I created this mode to help me write tasks for my team with clear and actionable acceptance criteria. Mode Definition You are a professional project manager. You are highly skilled in breaking down large tasks into bite-sized pieces that are actionable by an engineering team or an LLM performing engineering tasks. You analyze features carefully and detail out all edge cases and scenarios so that no detail is missed. Mode-specific Instructions Think creatively about how to detail out features. Provide a technical and business case explanation about feature value. Break down features and functionality in the following way. The following example would be for user login: User Login: As a user, I can log in to the application so that I can make changes. This prevents anonymous individuals from accessing the admin panel. Acceptance Criteria * On the login page, I can fill in my email address: * This field is required. * This field must enforce email format validation. * On the login page, I can fill in my password: * This field is required. * The input a user types into this field is hidden. * On failure to log in, I am provided an error dialog: * The error dialog should be the same if the email exists or not so that bad actors cannot glean info about active user accounts in our system. * Error dialog should be a red box pinned to the top of the page. * Error dialog can be dismissed. * After 4 failed login attempts, the form becomes locked: * Display a dialog to the user letting them know they can try again in 30 minutes. * Form stays locked for 30 minutes and the frontend will not accept further submissions. 🦾 Agent Consultant I created this mode for assistance with modifying my existing Roo modes and rules files as well as generating higher quality prompts for me. This mode leverages the Context7 MCP to keep up-to-date with documentation on Roo Code and prompt engineering best practices. Mode Definition You are an AI Agent coding expert. You are proficient in coding with agents and defining custom rules and guidelines for AI powered coding agents. Your specific expertise is in the Roo Code tool for VS Code are you are exceptionally capable at creating custom rules files and custom mode. This is your workflow that you should always follow: 1. 1. Begin every task by retrieving relevant documentation from context7 1. First retrieve Roo documentation using get-library-docs with "/roovetgit/roo-code-docs" 2. Then retrieve prompt engineering best practices using get-library-docs with “/dair-ai/prompt-engineering-guide" 2. Reference this documentation explicitly in your analysis and recommendations 3. Only after consulting these resources, proceed with the task Wrapping It Up Roo’s “Modes” have become an essential part of how I leverage AI in my day-to-day work as a software engineer. By tailoring each mode to specific tasks—whether it’s generating documentation, pairing on code, writing project specs, or improving prompt quality—I’ve been able to streamline my workflow and get more done with greater clarity and precision. Roo’s flexibility lets me define how it should behave in different contexts, giving me fine-grained control over how I interact with AI in my coding environment. Roo also has the capability of defining custom modes per project if that is needed by your team. If you find yourself repeating certain workflows or needing more structure in your interactions with AI tools, I highly recommend experimenting with your own custom modes. The payoff in productivity and developer experience is absolutely worth it....

Jun 13, 2025

6 mins

AIRoo Code

What Sets the Best Autonomous Coding Agents Apart?

Must-have Features of Coding Agents Autonomous coding agents are no longer experimental, they are becoming an integral part of modern development workflows, redefining how software is built and maintained. As models become more capable, agents have become easier to produce, leading to an explosion of options with varying depth and utility. Drawing insights from our experience using many agents, let's delve into the features that you'll absolutely want to get the best results. 1. Customizable System Prompts Custom agent modes, or roles, allow engineers to tailor the outputs to the desired results of their task. For instance, an agent can be set to operate in a "planning mode" focused on outlining development steps and gathering requirements, a "coding mode" optimized for generating and testing code, or a "documentation mode" emphasizing clarity and completeness of written artifacts. You might start with the off-the-shelf planning prompt, but you'll quickly want your own tailored version. Regardless of which modes are included out of the box, the ability to customize and extend them is critical. Agents must adapt to your unique workflows and prioritize what's important to your project. Without this flexibility, even well-designed defaults can fall short in real-world use. Engineers have preferences, and projects contain existing work. The best agents offer ways to communicate these preferences and decisions effectively. For example, 'pnpm' instead of 'npm' for package management, requiring the agent to seek root causes rather than offer temporary workarounds, or mandating that tests and linting must pass before a task is marked complete. Rules are a layer of control to accomplish this. Rules reinforce technical standards but also shape agent behavior to reflect project priorities and cultural norms. They inform the agent across contexts, think constraints, preferences, or directives that apply regardless of the task. Rules can encode things like style guidelines, risk tolerances, or communication boundaries. By shaping how the agent reasons and responds, rules ensure consistent alignment with desired outcomes. Roo code is an agent that makes great use of custom modes, and rules are ubiquitous across coding agents. These features form a meta-agent framework that allows engineers to construct the most effective agent for their unique project and workflow details. 2. Usage-based Pricing The best agents provide as much relevant information as possible to the model. They give transparency and control over what information is sent. This allows engineers to leverage their knowledge of the project to improve results. Being liberal with relevant information to the models is more expensive however, it also significantly improves results. The pricing model of some agents prioritizes fixed, predictable costs that include model fees. This creates an incentive to minimize the amount of information sent to the model in order to control costs. To get the most out of these tools, you’ve got to get the most out of models, which typically implies usage-based pricing. 3. Autonomous Workflows The way we accomplish work has phases. For example, creating tests and then making them pass, creating diagrams or plans, or reviewing work before submitting PRs. The best agents have mechanisms to facilitate these phases in an autonomous way. For the best results, each phase should have full use of a context window without watering down the main session's context. This should leverage your custom modes, which excel at each phase of your workflow. 4. Working in the Background The best agents are more effective at producing desired results and thus are able to be more autonomous. As agents become more autonomous, the ability to work in the background or work on multiple tasks at once becomes increasingly necessary to unlock their full potential. Agents that leverage local or cloud containers to perform work independently of IDEs or working copies on an engineer's machine further increase their utility. This allows engineers to focus on drafting plans and reviewing proposed changes, ultimately to work toward managing multiple tasks at once, overseeing their agent-powered workflows as if guiding a team. 5. Integrations with your Tools The Model Context Protocol (MCP) serves as a standardized interface, allowing agents to interact with your tools and data sources. The best agents seamlessly integrate with the platforms that engineers rely on, such as Confluence for documentation, Jira for tasks, and GitHub for source control and pull requests. These integrations ensure the agent can participate meaningfully across the full software development lifecycle. 6. Support for Multiple Model Providers Reliance on a single AI provider can be limiting. Top-tier agents support multiple providers, allowing teams to choose the best models for specific tasks. This flexibility enhances performance, the ability to use the latest and greatest, and also safeguards against potential downtimes or vendor-specific issues. Final Thoughts Selecting the right autonomous coding agent is a strategic decision. By prioritizing the features mentioned, technology leaders can adopt agents that can be tuned for their team's success. Tuning agents to projects and teams takes time, as does configuring the plumbing to integrate well with other systems. However, unlocking massive productivity gains is worth the squeeze. Models will become better and better, and the best agents capitalize on these improvements with little to no added effort. Set your organization and teams up to tap into the power of AI-enhanced engineering, and be more effective and more competitive....

Jun 3, 2025

4 mins

Next.js Rendering Strategies and how they affect core web vitals

When it comes to building fast and scalable web apps with Next.js, it’s important to understand how rendering works, especially with the App Router. Next.js organizes rendering around two main environments: the server and the client. On the server side, you’ll encounter three key strategies: Static Rendering, Dynamic Rendering, and Streaming. Each one comes with its own set of trade-offs and performance benefits, so knowing when to use which is crucial for delivering a great user experience. In this post, we'll break down each strategy, what it's good for, and how it impacts your site's performance, especially Core Web Vitals. We'll also explore hybrid approaches and provide practical guidance on choosing the right strategy for your use case. What Are Core Web Vitals? Core Web Vitals are a set of metrics defined by Google that measure real-world user experience on websites. These metrics play a major role in search engine rankings and directly affect how users perceive the speed and smoothness of your site. * Largest Contentful Paint (LCP): This measures loading performance. It calculates the time taken for the largest visible content element to render. A good LCP is 2.5 seconds or less. * Interaction to Next Paint (INP): This measures responsiveness to user input. A good INP is 200 milliseconds or less. * Cumulative Layout Shift (CLS): This measures the visual stability of the page. It quantifies layout instability during load. A good CLS is 0.1 or less. If you want to dive deeper into Core Web Vitals and understand more about their impact on your website's performance, I recommend reading this detailed guide on New Core Web Vitals and How They Work. Next.js Rendering Strategies and Core Web Vitals Let's explore each rendering strategy in detail: 1. Static Rendering (Server Rendering Strategy) Static Rendering is the default for Server Components in Next.js. With this approach, components are rendered at build time (or during revalidation), and the resulting HTML is reused for each request. This pre-rendering happens on the server, not in the user's browser. Static rendering is ideal for routes where the data is not personalized to the user, and this makes it suitable for: * Content-focused websites: Blogs, documentation, marketing pages * E-commerce product listings: When product details don't change frequently * SEO-critical pages: When search engine visibility is a priority * High-traffic pages: When you want to minimize server load How Static Rendering Affects Core Web Vitals * Largest Contentful Paint (LCP): Static rendering typically leads to excellent LCP scores (typically < 1s). The Pre-rendered HTML can be cached and delivered instantly from CDNs, resulting in very fast delivery of the initial content, including the largest element. Also, there is no waiting for data fetching or rendering on the client. * Interaction to Next Paint (INP): Static rendering provides a good foundation for INP, but doesn't guarantee optimal performance (typically ranges from 50-150 ms depending on implementation). While Server Components don't require hydration, any Client Components within the page still need JavaScript to become interactive. To achieve a very good INP score, you will need to make sure the Client Components within the page is minimal. * Cumulative Layout Shift (CLS): While static rendering delivers the complete page structure upfront which can be very beneficial for CLS, achieving excellent CLS requires additional optimization strategies: * Static HTML alone doesn't prevent layout shifts if resources load asynchronously * Image dimensions must be properly specified to reserve space before the image loads * Web fonts can cause text to reflow if not handled properly with font display strategies * Dynamically injected content (ads, embeds, lazy-loaded elements) can disrupt layout stability * CSS implementation significantly impacts CLS—immediate availability of styling information helps maintain visual stability Code Examples: 1. Basic static rendering: ` 2. Static rendering with revalidation (ISR): ` 3. Static path generation: ` 2. Dynamic Rendering (Server Rendering Strategy) Dynamic Rendering generates HTML on the server for each request at request time. Unlike static rendering, the content is not pre-rendered or cached but freshly generated for each user. This kind of rendering works best for: * Personalized content: User dashboards, account pages * Real-time data: Stock prices, live sports scores * Request-specific information: Pages that use cookies, headers, or search parameters * Frequently changing data: Content that needs to be up-to-date on every request How Dynamic Rendering Affects Core Web Vitals * Largest Contentful Paint (LCP): With dynamic rendering, the server needs to generate HTML for each request, and that can't be fully cached at the CDN level. It is still faster than client-side rendering as HTML is generated on the server. * Interaction to Next Paint (INP): The performance is similar to static rendering once the page is loaded. However, it can become slower if the dynamic content includes many Client Components. * Cumulative Layout Shift (CLS): Dynamic rendering can potentially introduce CLS if the data fetched at request time significantly alters the layout of the page compared to a static structure. However, if the layout is stable and the dynamic content size fits within predefined areas, the CLS can be managed effectively. Code Examples: 1. Explicit dynamic rendering: ` 2. Simplicit dynamic rendering with cookies: ` 3. Dynamic routes: ` 3. Streaming (Server Rendering Strategy) Streaming allows you to progressively render UI from the server. Instead of waiting for all the data to be ready before sending any HTML, the server sends chunks of HTML as they become available. This is implemented using React's Suspense boundary. React Suspense works by creating boundaries in your component tree that can "suspend" rendering while waiting for asynchronous operations. When a component inside a Suspense boundary throws a promise (which happens automatically with data fetching in React Server Components), React pauses rendering of that component and its children, renders the fallback UI specified in the Suspense component, continues rendering other parts of the page outside this boundary, and eventually resumes and replaces the fallback with the actual component once the promise resolves. When streaming, this mechanism allows the server to send the initial HTML with fallbacks for suspended components while continuing to process suspended components in the background. The server then streams additional HTML chunks as each suspended component resolves, including instructions for the browser to seamlessly replace fallbacks with final content. It works well for: * Pages with mixed data requirements: Some fast, some slow data sources * Improving perceived performance: Show users something quickly while slower parts load * Complex dashboards: Different widgets have different loading times * Handling slow APIs: Prevent slow third-party services from blocking the entire page How Streaming Affects Core Web Vitals * Largest Contentful Paint (LCP): Streaming can improve the perceived LCP. By sending the initial HTML content quickly, including potentially the largest element, the browser can render it sooner. Even if other parts of the page are still loading, the user sees the main content faster. * Interaction to Next Paint (INP): Streaming can contribute to a better INP. When used with React's <Suspense />, interactive elements in the faster-loading parts of the page can become interactive earlier, even while other components are still being streamed in. This allows users to engage with the page sooner. * Cumulative Layout Shift (CLS): Streaming can cause layout shifts as new content streams in. However, when implemented carefully, streaming should not negatively impact CLS. The initially streamed content should establish the main layout, and subsequent streamed chunks should ideally fit within this structure without causing significant reflows or layout shifts. Using placeholders and ensuring dimensions are known can help prevent CLS. Code Examples: 1. Basic Streaming with Suspense: ` 2. Nested Suspense boundaries for more granular control: ` 3. Using Next.js loading.js convention: ` 4. Client Components and Client-Side Rendering Client Components are defined using the React 'use client' directive. They are pre-rendered on the server but then hydrated on the client, enabling interactivity. This is different from pure client-side rendering (CSR), where rendering happens entirely in the browser. In the traditional sense of CSR (where the initial HTML is minimal, and all rendering happens in the browser), Next.js has moved away from this as a default approach but it can still be achievable by using dynamic imports and setting ssr: false. ` Despite the shift toward server rendering, there are valid use cases for CSR: 1. Private dashboards: Where SEO doesn't matter, and you want to reduce server load 2. Heavy interactive applications: Like data visualization tools or complex editors 3. Browser-only APIs: When you need access to browser-specific features like localStorage or WebGL 4. Third-party integrations: Some third-party widgets or libraries that only work in the browser While these are valid use cases, using Client Components is generally preferable to pure CSR in Next.js. Client Components give you the best of both worlds: server-rendered HTML for the initial load (improving SEO and LCP) with client-side interactivity after hydration. Pure CSR should be reserved for specific scenarios where server rendering is impossible or counterproductive. Client components are good for: * Interactive UI elements: Forms, dropdowns, modals, tabs * State-dependent UI: Components that change based on client state * Browser API access: Components that need localStorage, geolocation, etc. * Event-driven interactions: Click handlers, form submissions, animations * Real-time updates: Chat interfaces, live notifications How Client Components Affect Core Web Vitals * Largest Contentful Paint (LCP): Initial HTML includes the server-rendered version of Client Components, so LCP is reasonably fast. Hydration can delay interactivity but doesn't necessarily affect LCP. * Interaction to Next Paint (INP): For Client Components, hydration can cause input delay during page load, and when the page is hydrated, performance depends on the efficiency of event handlers. Also, complex state management can impact responsiveness. * Cumulative Layout Shift (CLS): Client-side data fetching can cause layout shifts as new data arrives. Also, state changes might alter the layout unexpectedly. Using Client Components will require careful implementation to prevent shifts. Code Examples: 1. Basic Client Component: ` 2. Client Component with server data: ` Hybrid Approaches and Composition Patterns In real-world applications, you'll often use a combination of rendering strategies to achieve the best performance. Next.js makes it easy to compose Server and Client Components together. Server Components with Islands of Interactivity One of the most effective patterns is to use Server Components for the majority of your UI and add Client Components only where interactivity is needed. This approach: 1. Minimizes JavaScript sent to the client 2. Provides excellent initial load performance 3. Maintains good interactivity where needed ` Partial Prerendering (Next.js 15) Next.js 15 introduced Partial Prerendering, a new hybrid rendering strategy that combines static and dynamic content in a single route. This allows you to: 1. Statically generate a shell of the page 2. Stream in dynamic, personalized content 3. Get the best of both static and dynamic rendering Note: At the time of this writing, Partial Prerendering is experimental and is not ready for production use. Read more ` Measuring Core Web Vitals in Next.js Understanding the impact of your rendering strategy choices requires measuring Core Web Vitals in real-world conditions. Here are some approaches: 1. Vercel Analytics If you deploy on Vercel, you can use Vercel Analytics to automatically track Core Web Vitals for your production site: ` 2. Web Vitals API You can manually track Core Web Vitals using the web-vitals library: ` 3. Lighthouse and PageSpeed Insights For development and testing, use: * Chrome DevTools Lighthouse tab * PageSpeed Insights * Chrome User Experience Report Making Practical Decisions: Which Rendering Strategy to Choose? Choosing the right rendering strategy depends on your specific requirements. Here's a decision framework: Choose Static Rendering when * Content is the same for all users * Data can be determined at build time * Page doesn't need frequent updates * SEO is critical * You want the best possible performance Choose Dynamic Rendering when * Content is personalized for each user * Data must be fresh on every request * You need access to request-time information * Content changes frequently Choose Streaming when * Page has a mix of fast and slow data requirements * You want to improve perceived performance * Some parts of the page depend on slow APIs * You want to prioritize showing critical UI first Choose Client Components when * UI needs to be interactive * Component relies on browser APIs * UI changes frequently based on user input * You need real-time updates Conclusion Next.js provides a powerful set of rendering strategies that allow you to optimize for both performance and user experience. By understanding how each strategy affects Core Web Vitals, you can make informed decisions about how to build your application. Remember that the best approach is often a hybrid one, combining different rendering strategies based on the specific requirements of each part of your application. Start with Server Components as your default, use Static Rendering where possible, and add Client Components only where interactivity is needed. By following these principles and measuring your Core Web Vitals, you can create Next.js applications that are fast, responsive, and provide an excellent user experience....

May 30, 2025

9 mins

NextJS

Advanced Authentication and Onboarding Workflows with Docusign Extension Apps

Advanced Authentication and Onboarding Workflows with Docusign Extension Apps Docusign Extension Apps are a relatively new feature on the Docusign platform. They act as little apps or plugins that allow building custom steps in Docusign agreement workflows, extending them with custom functionality. Docusign agreement workflows have many built-in steps that you can utilize. With Extension Apps, you can create additional custom steps, enabling you to execute custom logic at any point in the agreement process, from collecting participant information to signing documents. An Extension App is a small service, often running in the cloud, described by the Extension App manifest. The manifest file provides information about the app, including the app's author and support pages, as well as descriptions of extension points used by the app or places where the app can be integrated within an agreement workflow. Most often, these extension points need to interact with an external system to read or write data, which cannot be done anonymously, as all data going through Extension Apps is usually sensitive. Docusign allows authenticating to external systems using the OAuth 2 protocol, and the specifics about the OAuth 2 configuration are also placed in the manifest file. Currently, only OAuth 2 is supported as the authentication scheme for Extension Apps. OAuth 2 is a robust and secure protocol, but not all systems support it. Some systems use alternative authentication schemes, such as the PKCE variant of OAuth 2, or employ different authentication methods (e.g., using secret API keys). In such cases, we need to use a slightly different approach to integrate these systems with Docusign. In this blog post, we'll show you how to do that securely. We will not go too deep into the implementation details of Extension Apps, and we assume a basic familiarity with how they work. Instead, we'll focus on the OAuth 2 part of Extension Apps and how we can extend it. Extending the OAuth 2 Flow in Extension Apps For this blog post, we'll integrate with an imaginary task management system called TaskVibe, which offers a REST API to which we authenticate using a secret API key. We aim to develop an extension app that enables Docusign agreement workflows to communicate with TaskVibe, allowing tasks to be read, created, and updated. TaskVibe does not support OAuth 2. We need to ensure that, once the TaskVibe Extension App is connected, the user is prompted to enter their secret API key. We then need to store this API key securely so it can be used for interacting with the TaskVibe API. Of course, the API key can always be stored in the database of the Extension App. Sill, then, the Extension App has a significant responsibility for storing the API key securely. Docusign already has the capability to store secure tokens on its side and can utilize that instead. After all, most Extension Apps are meant to be stateless proxies to external systems. Updating the Manifest To extend OAuth 2, we will need to hook into the OAuth 2 flow by injecting our backend's endpoints into the authorization URL and token URL parts of the manifest. In any other external system that supports OAuth 2, we would be using their OAuth 2 endpoints. In our case, however, we must use our backend endpoints so we can emulate OAuth 2 to Docusign. ` The complete flow will look as follows: In the diagram, we have four actors: the end-user on behalf of whom we are authenticating to TaskVibe, DocuSign, the Extension App, and TaskVibe. We are only in control of the Extension App, and within the Extension App, we need to adhere to the OAuth 2 protocol as expected by Docusign. 1. In the first step, Docusign will invoke the /authorize endpoint of the Extension App and provide the state, client_id, and redirect_uri parameters. Of these three parameters, state and redirect_uri are essential. 2. In the /authorize endpoint, the app needs to store state and redirect_uri, as they will be used in the next step. It then needs to display a user-facing form where the user is expected to enter their TaskVibe API key. 3. Once the user submits the form, we take the API key and encode it in a JWT token, as we will send it over the wire back to Docusign in the form of the code query parameter. This is the "custom" part of our implementation. In a typical OAuth 2 flow, the code is generated by the OAuth 2 server, and the client can then use it to request the access token. In our case, we'll utilize the code to pass the API key to Docusign so it can send it back to us in the next step. Since we are still in control of the user session, we redirect the user to the redirect URI provided by Docusign, along with the code and the state as query parameters. 4. The redirect URI on Docusign will display a temporary page to the user, and in the background, attempt to retrieve the access token from our backend by providing the code and state to the /api/token endpoint. 5. The /api/token endpoint takes the code parameter and decodes it to extract the TaskVibe API secret key. It can then verify if the API key is even valid by making a dummy call to TaskVibe using the API key. If the key is valid, we encode it in a new JWT token and return it as the access token to Docusign. 6. Docusign stores the access token securely on its side and uses it when invoking any of the remaining extension points on the Extension App. By following the above step, we ensure that the API key is stored in an encoded format on Docusign, and the Extension App effectively extends the OAuth 2 flow. The app is still stateless and does not have the responsibility of storing any secure information locally. It acts as a pure proxy between Docusign and TaskVibe, as it's meant to be. Writing the Backend Most Extension Apps are backend-only, but ours needs to have a frontend component for collecting the secret API key. A good fit for such an app is Next.js, which allows us to easily set up both the frontend and the backend. We'll start by implementing the form for entering the secret API key. This form takes the state, client ID, and redirect URI from the enclosing page, which takes these parameters from the URL. The form is relatively simple, with only an input field for the API key. However, it can also be used for any additional onboarding questions. If you will ever need to store additional information on Docusign that you want to use implicitly in your Extension App workflow steps, this is a good place to store it alongside the API secret key on Docusign. ` Submitting the form invokes a server action on Next.js, which takes the entered API key, the state, and the redirect URI. It then creates a JWT token using Jose that contains the API key and redirects the user to the redirect URI, sending the JWT token in the code query parameter, along with the state query parameter. This JWT token can be short-lived, as it's only meant to be a temporary holder of the API key while the authentication flow is running. This is the server action: ` After the user is redirected to Docusign, Docusign will then invoke the /api/token endpoint to obtain the access token. This endpoint will also be invoked occasionally after the authentication flow, before any extension endpoint is invoked, to get the latest access token using a refresh token. Therefore, the endpoint needs to cover two scenarios. In the first scenario, during the authentication phase, Docusign will send the code and state to the /api/token endpoint. In this scenario, the endpoint must retrieve the value of the code parameter (storing the JWT value), parse the JWT, and extract the API key. Optionally, it can verify the API key's validity by invoking an endpoint on TaskVibe using that key. Then, it should return an access token and a refresh token back to Docusign. Since we are not using refresh tokens in our case, we can create a new JWT token containing the API key and return it as both the access token and the refresh token to Docusign. In the second scenario, Docusign will send the most recently obtained refresh token to get a new access token. Again, because we are not using refresh tokens, we can return both the retrieved access token and the refresh token to Docusign. The api/token endpoint is implemented as a Next.js route handler: ` In all the remaining endpoints defined in the manifest file, Docusign will provide the access token as the bearer token. It's up to each endpoint to then read this value, parse the JWT, and extract the secret API key. Conclusion In conclusion, your Extension App does not need to be limited by the fact that the external system you are integrating with does not have OAuth 2 support or requires additional onboarding. We can safely build upon the existing OAuth 2 protocol and add custom functionality on top of it. This is also a drawback of the approach - it involves custom development, which requires additional work on our part to ensure all cases are covered. Fortunately, the scope of the Extension App does not extend significantly. All remaining endpoints are implemented in the same manner as any other OAuth 2 system, and the app remains a stateless proxy between Docusign and the external system as all necessary information, such as the secret API key and other onboarding details, is stored as an encoded token on the Docusign side. We hope this blog post was helpful. Keep an eye out for more Docusign content soon, and if you need help building an Extension App of your own, feel free to reach out. The complete source code for this project is available on StackBlitz....

Jul 4, 2025

8 mins

Docusign

Let's innovate together!

We're ready to be your trusted technical partners in your digital innovation journey.

Whether it's modernization or custom software solutions, our team of experts can guide you through best practices and how to build scalable, performant software that lasts.

ChatGPT can't solve these problems! with Chris Gardner

You might also like

Roo Custom Modes

What Sets the Best Autonomous Coding Agents Apart?

Next.js Rendering Strategies and how they affect core web vitals

Advanced Authentication and Onboarding Workflows with Docusign Extension Apps

Let's innovate together!

You might also like

Roo Custom Modes

What Sets the Best Autonomous Coding Agents Apart?

Next.js Rendering Strategies and how they affect core web vitals

Advanced Authentication and Onboarding Workflows with Docusign Extension Apps