
Efficiently Extract Object References in Shopify Storefront GraphQL API


This article was written over 18 months ago and may contain information that is out of date. Some content may be relevant but please refer to the relevant official documentation or available resources for the latest information.


Introduction

This blog post was born out of necessity and a bit of frustration. If you're diving into Shopify's Storefront API, you've probably realized that while it's powerful, extracting the data behind an object reference in a metafield or metaobject field (in a GraphQL query) can feel like searching for a needle in a haystack. The difficulty comes not from any lack of capability in the API but from sparse and sometimes unclear documentation on this specific aspect.

That's precisely why I decided to create this post. As a developer, I found myself in a situation where the documentation and community resources were either scarce or not detailed enough for the specific challenges I faced. This guide is the result of my journey - from confusion to clarity.

The Situation

To understand the crux of my challenge, it's essential to recognize that creating metafields and metaobjects is a common practice for those seeking a more customized and controlled experience with Shopify CMS. In my specific case, I wanted to enrich the information available for each product's vendor beyond what Shopify typically allows, which is just a single text box. I aimed to have each vendor display their name and two versions of their logo: a themed logo that aligns with my website's color scheme and an original logo for use on specific pages.

[Image: configuration for the custom vendor metaobject]
[Image: details of the original logo field in the vendor object; the field is a file (image) field]

The challenge emerged when I fetched a list of all vendors to display on a page. My GraphQL query for the Storefront API looked like this:

query vendorsMetaObjects($country: CountryCode, $language: LanguageCode)
@inContext(country: $country, language: $language) {
    vendorsCollection: metaobjects(type: "vendor", first: ${MAX_PAGE_BY}) {
        nodes {
            name: field(key: "name") {
                value
            }
            originalLogo: field(key: "original_logo") {
                value
            }
            themedLogo: field(key: "themed_logo") {
                value
            }
        }
    }
}

This was when I hit a roadblock. How do I fetch a field with a more complex type than a simple text or number, like an image? To retrieve the correct data, what specific details must I include in the originalLogo and themedLogo fields?

In my quest for a solution, I turned to every resource I could think of. I combed through the Storefront API documentation, searched endlessly on Stack Overflow, and browsed various tech forums. Despite all these efforts, I couldn’t find the clear, detailed answers I needed. It felt like I was looking for something that should be there but wasn’t.

Solution

Before diving into the solution, it's important to note that this is the method I discovered through trial and error. There might be other approaches, but I want to share the process that worked for me in the absence of clear documentation.

My first step was to understand the nature of the data returned by the Storefront API. I inspected the value of a metaobject, which looked something like this:

{
  "name": { "value": "A Vendor Test 1" },
  "originalLogo": { "value": "gid://shopify/MediaImage/some_ID" },
  "themedLogo": { "value": "gid://shopify/MediaImage/some_other_ID" }
}

The key here was the gid, Shopify's global ID. What stood out was that it always includes the object type, in this case MediaImage. This was crucial because it told me which type to target with an inline fragment on the field's reference, and which page of the Storefront API documentation to consult for the properties I could query on that object.
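As a small illustration of reading the type out of a gid, here is a minimal sketch. The helper name gidType is my own (it is not part of any Shopify SDK), and it assumes the gid follows the gid://shopify/Type/id shape seen in the response above.

```typescript
// Hypothetical helper: extract the type segment from a global ID
// shaped like gid://shopify/<Type>/<id>. Returns null for anything else.
function gidType(gid: string): string | null {
  const match = /^gid:\/\/shopify\/([A-Za-z]+)\/.+$/.exec(gid);
  return match?.[1] ?? null;
}

console.log(gidType("gid://shopify/MediaImage/some_ID")); // MediaImage
```

The returned name is what goes after "... on" in the inline fragment.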

So, I modified my query to include a reference to this object type, focusing on the originalLogo field as an example:

query vendorsMetaObjects($country: CountryCode, $language: LanguageCode)
@inContext(country: $country, language: $language) {
    vendorsCollection: metaobjects(type: "vendor", first: ${MAX_PAGE_BY}) {
        nodes {
            # ...
            originalLogo: field(key: "original_logo") {
                value
                reference {
                    ... on MediaImage {
                        # Explore MediaImage documentation for extractable fields
                    }
                }
            }
            # ...
        }
    }
}

The next step was to consult the MediaImage page in Shopify's Storefront API documentation. There, I discovered that MediaImage has an image field, an object that in turn contains the url field. With this information, I updated my query:

query vendorsMetaObjects($country: CountryCode, $language: LanguageCode)
@inContext(country: $country, language: $language) {
    vendorsCollection: metaobjects(type: "vendor", first: ${MAX_PAGE_BY}) {
        nodes {
            name: field(key: "name") {
                value
            }
            originalLogo: field(key: "original_logo") {
                reference {
                    ... on MediaImage {
                        image {
                            url
                        }
                    }
                }
            }
            themedLogo: field(key: "themed_logo") {
                reference {
                    ... on MediaImage {
                        image {
                            url
                        }
                    }
                }
            }
        }
    }
}
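For readers wondering how a query like this gets sent, here is a rough sketch of building the HTTP request. Everything outside the query text is an assumption on my part: the value of MAX_PAGE_BY, the helper name buildVendorsRequest, and the placeholder shop domain, API version, and token. The endpoint path and the X-Shopify-Storefront-Access-Token header follow the Storefront API's documented conventions, but check them against the version you use.

```typescript
// Assumed page size; the post interpolates MAX_PAGE_BY without showing its value.
const MAX_PAGE_BY = 50;

const VENDORS_QUERY = `
  query vendorsMetaObjects($country: CountryCode, $language: LanguageCode)
  @inContext(country: $country, language: $language) {
    vendorsCollection: metaobjects(type: "vendor", first: ${MAX_PAGE_BY}) {
      nodes {
        name: field(key: "name") { value }
        originalLogo: field(key: "original_logo") {
          reference { ... on MediaImage { image { url } } }
        }
        themedLogo: field(key: "themed_logo") {
          reference { ... on MediaImage { image { url } } }
        }
      }
    }
  }
`;

interface StorefrontRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: all arguments are placeholders you supply yourself.
function buildVendorsRequest(
  shopDomain: string,
  apiVersion: string,
  token: string,
  country: string,
  language: string,
): StorefrontRequest {
  return {
    url: `https://${shopDomain}/api/${apiVersion}/graphql.json`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "X-Shopify-Storefront-Access-Token": token,
    },
    // The query text and its variables travel together in the JSON body.
    body: JSON.stringify({ query: VENDORS_QUERY, variables: { country, language } }),
  };
}

const request = buildVendorsRequest("example.myshopify.com", "2024-01", "storefront-token", "US", "EN");
console.log(request.url); // https://example.myshopify.com/api/2024-01/graphql.json
```

The resulting object can be handed to fetch by passing request.url as the first argument and the method, headers, and body as the options.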

Finally, when executing this query, the output for a single object was as follows:

{
  "name": { "value": "A Vendor Test 1" },
  "originalLogo": {
    "reference": {
      "image": {
        "url": "https://cdn.shopify.com/s/files/rest_of_the_url"
      }
    }
  },
  "themedLogo": {
    "reference": {
      "image": {
        "url": "https://cdn.shopify.com/s/files/rest_of_the_url"
      }
    }
  }
}

Through this process, I extracted the data I needed from the object references in the metaobject fields, including more complex data types such as images.
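On the client side, the nested response is easier to work with once flattened. The following is a sketch only: the interface and function names are mine, and I am assuming that each field, its reference, and the image can come back as null (for example, when a vendor has no logo set).

```typescript
// Shape of one node returned by the final query, with assumed nullability.
interface ImageField {
  reference: { image: { url: string } | null } | null;
}

interface VendorNode {
  name: { value: string } | null;
  originalLogo: ImageField | null;
  themedLogo: ImageField | null;
}

// Flat shape that is convenient for rendering.
interface Vendor {
  name: string;
  originalLogoUrl: string | null;
  themedLogoUrl: string | null;
}

// Walk field -> reference -> image -> url, tolerating nulls at every level.
function logoUrl(field: ImageField | null): string | null {
  return field?.reference?.image?.url ?? null;
}

function toVendor(node: VendorNode): Vendor {
  return {
    name: node.name?.value ?? "",
    originalLogoUrl: logoUrl(node.originalLogo),
    themedLogoUrl: logoUrl(node.themedLogo),
  };
}

const sampleNode: VendorNode = {
  name: { value: "A Vendor Test 1" },
  originalLogo: {
    reference: { image: { url: "https://cdn.shopify.com/s/files/rest_of_the_url" } },
  },
  themedLogo: null,
};

console.log(toVendor(sampleNode).originalLogoUrl); // https://cdn.shopify.com/s/files/rest_of_the_url
```

Mapping toVendor over vendorsCollection.nodes then gives a plain list of vendors to display.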

Conclusion

In wrapping up, it's vital to emphasize that while this guide focused on extracting MediaImage data from Shopify's Storefront API, the methodology I've outlined is broadly applicable. The key is reading the object type out of the gid (global ID) and using it to target the correct type in your GraphQL queries.

Whether you're dealing with images or any other data type defined in Shopify's Storefront API, this approach can be your compass. Dive into the API documentation, identify the object types relevant to your needs, and adapt your queries accordingly. It's a versatile strategy that can be tailored to suit many requirements.
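To make that adaptation concrete, here is one possible sketch of generating the reference selection from a type name. The lookup table and the helper name referenceSelection are hypothetical, and the field names for each type come from my reading of the Storefront API reference, so please verify them for the API version you use.

```typescript
// Selections keyed by the type name found in the gid.
// Assumed field names; confirm each against the Storefront API documentation.
const REFERENCE_SELECTIONS: Record<string, string> = {
  MediaImage: "image { url altText }",
  Video: "sources { url mimeType }",
  Product: "title handle",
};

// Hypothetical helper: build the reference block for a given type name.
function referenceSelection(typeName: string): string {
  const selection = REFERENCE_SELECTIONS[typeName];
  if (selection === undefined) {
    throw new Error(`No selection registered for ${typeName}`);
  }
  return `reference { ... on ${typeName} { ${selection} } }`;
}

console.log(referenceSelection("MediaImage"));
// reference { ... on MediaImage { image { url altText } } }
```

Throwing on an unknown type is a deliberate choice here: it surfaces a missing entry while you are building the query instead of silently returning a reference with no data.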

Remember, the world of APIs and e-commerce is constantly evolving, and staying adaptable and resourceful is crucial. This journey has been a testament to the power of perseverance and creative problem-solving in the face of technical challenges. May your ventures into Shopify's Storefront API be equally rewarding and insightful.
