Make sure machines truly see your brand and your product
Myriam says: “Everyone is slowly coming to realise that SEO is changing in many different ways.
My tip is to make sure that your brand and your products are machine-readable, because we're dealing with multimodal search now. That means I can take out my phone, take a picture of your product and say, ‘Is this vegan?’ - and I will get the answer right away.
SEO is entering the outside world, beyond the web.”
What does this mean for both tangible products and less-tangible services, and how do you package what you offer to ensure that it's machine-readable?
“First of all, if you do have products, I have some tips.
Take a picture of your packaging. Make sure that it's machine-readable: the contrast is good, the words come out, etc. That’s very simple, but what if you're a plumber and you don't package anything?
With multimodal search, I could once again take out my phone and take a video of my leaking faucet and ask ChatGPT, Gemini, or Perplexity, ‘Can you find me a plumber nearby that can actually fix this thing? I don't know what it's called. I don't care to know what it's called. I want it fixed.’ The machine will check the video or the picture, try to figure out what it is, and seek out the plumber who has information online saying, ‘This person will fix that specific problem.’
It will find the proper keyword for us, fan out the queries, check a few things, and then say, ‘These are the people you should consider.’ Therefore, if you are a plumber, you should make sure that people leave you reviews that say, ‘My faucet was leaking and this person fixed it. I had tried everything else. It was a complex thing. They did it.’
If you have your own website, you should also think about how customers look for you and in what context. Many years ago, I had a client who did emergency heater fixes. You went on the website, and you had to select if you were a business or residential, then you had to figure out what type of heating you had, and then finally, you could contact them. There was no option for, ‘I have an emergency. It's minus 40 degrees in Canada. I need you to come and fix the heat.’
That's something that brands, and even small local businesses, need to consider. Think about your customers and how they think, and try to match that.”
How do we optimize for these types of phrases?
“If you're a plumber, you can have pictures of leaky faucets on your website, and text around that describing what it is and how your service can actually help with that.
You can have a picture, but you should have an explanation as to what you fix as well. That is very important. Bonus points if you are on YouTube or TikTok showing these things and saying, ‘Here’s the issue you have, this is what it’s called, here’s how you can try to fix it yourself, but if you have problems, call me.’ Obviously, users will see that, decide that they don’t want to go under their sink themselves, and then call you.
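One concrete way to make "a picture of a leaky faucet plus text around it" legible to machines is schema.org structured data on the service page. A minimal sketch, with every business name, area, and URL invented for illustration (schema.org does define a `Plumber` business type and a generic `Service` type):

```python
import json

# Hypothetical details for a local plumber; swap in the real business data.
plumber_jsonld = {
    "@context": "https://schema.org",
    "@type": "Plumber",  # schema.org type for plumbing businesses
    "name": "Example Plumbing Co.",
    "areaServed": "Montreal, QC",
    "makesOffer": {
        "@type": "Offer",
        "itemOffered": {
            "@type": "Service",
            "name": "Leaking faucet repair",
            "description": "Emergency repair of dripping or leaking faucets, "
                           "including cartridge and washer replacement.",
        },
    },
    # Image of the problem being fixed, matching the on-page photo.
    "image": "https://example.com/images/leaky-faucet-repair.jpg",
}

# Embed this output in a <script type="application/ld+json"> tag on the page.
print(json.dumps(plumber_jsonld, indent=2))
```

The point is to pair the photo with text that names the problem in the customer's own words, so a query fanned out from an image of a dripping tap can land on it.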
Don’t overthink it. If you are already native to TikTok or you know how to do video content, it doesn’t need to be perfect. People just want to see what it's like so they can decide, do I do it myself? Do I call someone? How bad is it? You want to be there when people are starting to muddle through.
What I find fascinating with multimodal is that images, video, text, and audio are all treated as a universal language by an LLM. It breaks each one down and folds it into the same representation. This is what companies need to think about.
One extra tip I would give is to take a picture of your store (if you have one) or your logo, and upload that onto an LLM. Then, ask it, ‘What can you tell me about this business?’ That's how you can see whether your footprint is big enough, and whether you are actually understood or contextualised by machines or not.”
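The "upload your logo and ask" check can be scripted too. A sketch of the request body for an OpenAI-style multimodal chat endpoint, assuming the image-as-data-URI content format; the model name is a placeholder and the network call itself is left out:

```python
import base64
import json

def build_brand_check_payload(image_bytes: bytes, mime: str = "image/png") -> dict:
    """Build a chat payload asking a vision-capable model what it knows
    about the business shown in a logo or storefront photo."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": "gpt-4o",  # placeholder; any vision-capable model
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "What can you tell me about this business?"},
                # Image passed inline as a base64 data URI.
                {"type": "image_url",
                 "image_url": {"url": f"data:{mime};base64,{b64}"}},
            ],
        }],
    }

# Fake bytes stand in for a real logo file; here we only inspect structure.
payload = build_brand_check_payload(b"\x89PNG fake bytes for illustration")
print(json.dumps(payload)[:80])
```

If the answer comes back generic or wrong, that is the signal Myriam describes: your footprint is too small, or your logo is too hard to parse.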
If it doesn’t know who you are, could you ask it how to make your brand more prominent within its results?
“One of the easiest tips for that is something that Crystal Carter popularised, which is to click on the little thumbs down button and explain why it's wrong. Correct it. I've had to do it myself, and it works. I'm not saying it's magical, but it does have an impact.
The second thing is, again, make sure your brand is machine-readable. Maybe your logo is so complicated to parse that nothing shows up. Maybe your footprint is too small because you've just launched, so it doesn't know you. In that case, get your logo out there.
It's basic SEO. Make sure that you're linked in specific places. If you do conferences, for example, share those decks. If you're sponsoring something, make sure your logo is visible, etc.
Something that I say is, ‘Your product is now your landing page,’ which scares some SEOs.
You're not a designer, but as an SEO, you should know how OCR (Optical Character Recognition) works, which is what these LLMs are relying on. You need to make sure that it's contrasted enough that it can be seen easily. You need to pick fonts that are sans-serif and very clean so they can be interpreted. You also need to figure out whether your packaging is so shiny that it doesn’t show up when you take a picture of it.
Last but not least, I don't like them, but QR codes really help.”
Do you need to have images, voice, and video on your website, or can it be enough to have text that helps deliver your results in that form, wherever your users search?
“Yes and no.
When I see you on the screen, David, whether I am an LLM or a human, I know you. You look like a professional. You look like you're in control. I can see the Casting Cred logo and your Majestic t-shirt. You have your glasses, which are like a semiotic anchor. We know it's you when we see those glasses.
You can get by with only text, but videos are an amazing thing because they pack a punch. The alt text used to describe you wouldn’t capture the concentration on your face, the sentiment, and all of those little details – but multimodal does. This means that, if you’re a brand and you’re trying to explain something, the easiest way to do that is often through video.
You can send messages through video that our eyeballs catch, as humans, and now machines catch them as well. If I saw the same setup, but it wasn’t you in that chair, would it have the same weight? Not necessarily. That's why owners or operators of companies could use little, short videos.
They don't need to be perfect, but think about how you're going to send a message. Is there a logo that a machine can read on your shirt? Is there a background? If you're a mechanic, does it scream, ‘I'm a mechanic’? All of these little things now matter a lot.
Many SEOs don't realise that multimodal search aligns more naturally with what humans do. When we see a beautiful flower, we wonder what it is. Now, you can take a picture and ask. This is great.
There's not much data around what's going on, and I'm eagerly awaiting that. I've asked John Mueller, ‘Can we have information in the image report on Google Search Console? What's going on with Google Lens?’ Ultimately, though, it's a brand-new way of thinking about search.
We get to do a lot more stuff. I can take a video of my German microwave and ask, ‘How do I operate these things? I don't understand the icons or the text. Help me!’ This is cool.”
What does this mean for brand identity in general, and how has brand identity changed over the last few years?
“That takes me on to something else that is very important. When you talk about generative search, LLMs will hallucinate. They will come up with features that your product doesn't have, and then your customer service gets flooded.
Restaurants are so fed up right now because customers keep asking for specials that they never had, but the customers swear that they do. This is AI brand drift: where an AI starts making things up, or misunderstands and misrepresents your brand.
If you're thinking about doing a very inside joke for April Fool's Day, maybe don't. One small indie company joked that they were acquired by a big, bad brand that nobody likes, and now ChatGPT keeps saying that they are part of this giant conglomerate. That’s a problem.
Make sure that what you say is consistent with your brand identity. Humans can understand subtlety, but machines can’t.
These are bull excrement machines. The difference between the truth, a lie, and bull excrement is that the truth is the truth. It's indexed on reality. A lie had better be a really good lie. If you ask me what I’m doing next weekend and I tell you, ‘I'm playing water polo, but with ponies in the pool,’ you will know that I’m lying to you. A good lie has to intersect the truth and not explode.
An AI model doesn't care. It can be true; it can be false. It is not tied to the truth, and that's a problem for brands.
That’s why I say that multimodal is amazing. Whether it be you on a podcast, you being interviewed on TV with an upload to YouTube, or images of you being uploaded, it doesn't matter. It's the same language, and you get to set the record straight. You provide even more content to anchor your brand in reality and minimise all that false information.
For example, a restaurant can add a page to their site that says, ‘These are exactly which specials we offer, and nothing else.’”
What do you mean when you say that product packaging and landing pages are interchangeable as discovery vectors, and both must serve as entry points that AI systems understand and present cross-modally?
“We often think that we’re controlling the experience. You want people to end up on your landing page or your product detail page, and you think that they’re going to read all of this and make a purchase.
Now, though, if I like your t-shirt, I'm going to take a picture of it and say, ‘Where can I buy this in red online?’ In that interaction, the product becomes a landing page. I landed on it, I took a picture, and then I sought it out. I almost don't need a landing page. If there were a button magically saying, ‘Purchase this here,’ and I could guarantee that it's good, I would click it.
This is very different from a brand thinking that you have the control, and you are the one explaining what the product is on your page. If your physical product is your landing page, then maybe it's time to also add frequently asked questions to the back of the product. Then, if I turn it over, take a picture, and ask about the ingredients, there's stuff underneath that says, ‘These ingredients are X and Y.’ I may not know what you're telling me with these words, but you're giving the information to ChatGPT, Gemini, and Perplexity, and they know how to contextualise it for me.
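The questions printed on the back of the pack can be mirrored on the product page as structured data, so the answers are available whether the machine reads the packaging or the site. A small sketch emitting schema.org `FAQPage` JSON-LD; the questions and answers are made up:

```python
import json

def faq_jsonld(pairs: list[tuple[str, str]]) -> str:
    """Render question/answer pairs as schema.org FAQPage JSON-LD,
    ready to embed in a <script type="application/ld+json"> tag."""
    doc = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    return json.dumps(doc, indent=2)

# Hypothetical packaging questions mirrored on the product page.
print(faq_jsonld([
    ("Is this product vegan?",
     "Yes, it contains no animal-derived ingredients."),
    ("Is it scent-free?",
     "Yes, no fragrances or masking scents are added."),
]))
```

Keeping the on-pack wording and the on-page wording identical gives the model two consistent sources to anchor its answer on.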
I have a friend who is allergic to any scent on a product. I don't know all the words that she would look out for, but I can take a picture and, if the ingredients and the text underneath explain it, the machine goes, ‘Yes, I can confirm that this will work for you.’ It could check online for other information, but the packaging itself provides enough context.”
How can we improve attribution and understand the value of different touchpoints in the new user journey?
“First of all, I think we've been lied to for a very long time about these metrics, thinking that we have full control. We did not, but we were very good at pretending that we did, and we hid behind that performance veneer.
The reality is, are these banner ads truly being shown to people? Are people really noticing these banners? Are they buried at the bottom of the page where nobody scrolls? It's a good time to rethink how we measure all of these elements, but I'm also tearing my hair out over it.
I'm waiting for ChatGPT and OpenAI to come out with their first advertising platforms. As soon as they start asking advertisers for money, we're going to have metrics.
On the other hand, I am very worried because there's no way to track a user taking a picture of your shirt and saying, ‘I want this in blue.’ As an SEO, I'm going to start seeing some of these queries popping up in my tools, and I'm going to notice that an entire chunk of that query is missing: the photo of the shirt.
That will also be true in relation to whatever other mode you've uploaded, whether it be a YouTube video, an image, a podcast, or a vocal command with text on it. We're going to have hints. We're going to have vague little crumbs telling us that this is happening. We're going to have big stats from Google telling us how many Lens searches are made, but beyond that? Nada.
That’s why I asked John Mueller whether Google could at least consider giving us insights, because they already have Search Console. OpenAI doesn't provide anything. I contacted Perplexity, and they said, ‘We're not allowed to discuss this in support.’ Then who's going to do it?
They want to make money, and we want to optimize. Maybe they need to grow up and realise, if you want SEOs to optimize and help your ecosystem thrive, you need to give us documentation and stats.”
Myriam, what’s the key takeaway from the tip you shared today?
“We used to focus on content marketing with the understanding that content marketing is text-heavy, so we always defaulted to written content.
Right now, there are multiple modes, and you don’t need to treat every single mode as a copy of the others. If you write an article and then make a video covering the same material, the video is treated as a different type of content, because it gives generative search additional information.
Multimodal search aligns more naturally with the way people search, and as marketers, you should align with those natural ways of searching. Provide images, provide videos, and provide audio whenever it makes sense – or assume that your content will be queried using the same modes, AKA, it all becomes one universal language.”
Myriam Jessier is a Consultant and Trainer at Pragm and Neurospicy. Find out more over at Pragm.co.