Cyber Security in a Generative AI world

CyberWaala
Jul 17, 2023

Unless you’ve been living under a rock, or are simply too engrossed in the Threads vs. Twitter debate, you’ve heard the term “Generative AI”.

I think we’ve had sufficient time to play around with this technology to know it isn’t just a “fad”, and the genie is, in fact, out of the bottle.

Now what? Do we let this potential Skynet situation run our lives and hide in a bunker waiting for John Connor, or do we learn to harness this tech and use it to gain a competitive advantage in our day-to-day tasks?


The latter.

There are lots of blog posts and YouTube videos out there that will teach you how to use these products, so I’m not going to cover that. What this post will get into is how to look at all of this through a cyber security lens. (And yes, the PPT approach, for all you cyber experts out there: People, Process, Technology.)

The OBVIOUS choice, which a lot of the industry has taken, is to explicitly ban ChatGPT, Bard, and the like. Which isn’t wrong… just too aggressive. Or do you think that if you ban something, your team will actually stop using it?

As a former teenager with explicit rules about what NOT to do, trust me when I say that the word “NOT” is the biggest motivator to go and do that exact thing.

Blocking such a tempting and new product isn’t easy, mind you. We can deploy explicit blocking rules, set up policies, and even build detections that alert when someone uses one of these products, but in practice none of it holds. Something as simple as using a personal device to access these products throws all of those mechanisms out the window, defeated by sheer logic and simplicity. Basically the opposite of a Michael Bay movie.
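To make that concrete, here’s a minimal sketch of the kind of domain-based detection I mean. The domain list and the log format are assumptions for illustration, not any real proxy or EDR product’s schema:

```python
# Minimal sketch of a domain-based detection for generative AI usage.
# The domain list and log record shape are illustrative assumptions,
# not pulled from any real proxy or EDR product.
BLOCKED_AI_DOMAINS = {
    "chat.openai.com",
    "api.openai.com",
    "bard.google.com",
}

def flag_ai_usage(proxy_log_line: str) -> bool:
    """Return True if a proxy log line shows traffic to a known AI domain.

    Assumes a simple space-delimited log: '<timestamp> <user> <domain> <path>'.
    """
    parts = proxy_log_line.split()
    if len(parts) < 3:
        return False
    domain = parts[2].lower()
    return any(domain == d or domain.endswith("." + d) for d in BLOCKED_AI_DOMAINS)

# This catches corporate traffic, but a personal phone on LTE never
# touches the proxy, which is exactly the gap described above.
print(flag_ai_usage("2023-07-17T10:00:00 alice chat.openai.com /c/abc123"))  # True
```

Note the catch baked into the example: it only sees traffic that actually flows through infrastructure you control.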

Instead, there needs to be a way to control exposure to such products: technology guardrails that allow controlled usage while still providing the rich feature set the team needs to flourish.

Before any of this happens, however, there needs to be an understanding of the use cases these generative AI products will serve and the gaps they’re meant to fill. As with any system design task, a comprehensive analysis of the system and its use cases is what enables strategists to predict and proactively prepare for any and all risks. It’s a tried and tested formula that has LITERALLY fueled a whole industry.

Staying true to the PPT approach, let’s start with the first P: People.

This is the trickiest problem to solve, as it’s never one-size-fits-all. The easiest approach, in my opinion, is to decentralize the whole problem. No, I don’t mean blockchain here; I mean let every person have their own say. Instead of only allowing the leaders of the org to weigh in, open up the floor for everyone to pitch, share, explore, and rebuke ideas, and open a dialogue around the use of generative AI in the workplace. At the same time, it is the security team’s responsibility to educate those who aren’t aware of the cyber-related risks of operating such products.

Solving the people aspect here is as much a shared responsibility problem as the next SaaS product ;)

Now, onto Process.

It’s very tricky to define a process for something so new and unexplored. However, every new technology has passed through this stage at some point in its progression, so let’s do it right this time around too.

One of the best ways to control usage and gain visibility into the product is to define the actual processes through which the product can be used. Something as simple as a “usage policy” or a defined “access request procedure” can go a long way in providing a structure and framework for leveraging such products.
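For illustration, an “access request” for a generative AI tool might capture something like the following. Every field name here is an assumption I’ve made up, not a reference to any real GRC tool’s schema:

```python
# Illustrative sketch of what an access-request record for a generative
# AI tool might capture; all field names are made-up assumptions.
from dataclasses import dataclass, field
from datetime import date

@dataclass
class AIAccessRequest:
    requester: str
    tool: str                    # e.g. "ChatGPT", "Bard"
    business_use_case: str       # what gap is this filling?
    data_classification: str     # highest sensitivity of data involved
    approved: bool = False
    review_date: date = field(default_factory=date.today)

req = AIAccessRequest(
    requester="alice@example.com",
    tool="ChatGPT",
    business_use_case="Summarizing public product documentation",
    data_classification="Public",
)
# A security reviewer approves only if the data involved is low-sensitivity.
req.approved = req.data_classification in {"Public", "Internal"}
print(req.approved)  # True
```

Even a lightweight record like this gives you the visibility that an outright ban never will.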

The Technology aspect

This is actually not as hard as one might think. Obviously, renowned organizations have published research on the security risks and vulnerabilities of these products, and companies should fold those defenses into their own tech stack. At the crux of it, though, the problem is always going to be the input data and the output data. Input data must be controlled in the sense that ultra-sensitive data elements (secrets, IP, credentials, and so on) should never be fed into public implementations of the product. If a use case genuinely requires such data as input, it’s wise to leverage the API and embeddings and create your own vector data store, giving you added control. As I said, never one-size-fits-all!
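As a concrete illustration of the input-side control, here’s a minimal sketch of a pre-submission filter. The regex patterns are illustrative assumptions and nowhere near a complete DLP ruleset:

```python
import re

# Minimal sketch of a pre-submission filter for prompts headed to a public
# LLM. The patterns below are illustrative, not a complete DLP ruleset.
SENSITIVE_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                        # AWS access key IDs
    re.compile(r"-----BEGIN (RSA |EC )?PRIVATE KEY-----"),  # private key blocks
    re.compile(r"(?i)password\s*[:=]\s*\S+"),               # inline credentials
]

def safe_to_send(prompt: str) -> bool:
    """Return False if the prompt appears to contain sensitive material."""
    return not any(p.search(prompt) for p in SENSITIVE_PATTERNS)

prompt = "Debug this config: password = hunter2"
if safe_to_send(prompt):
    pass  # forward to the public API
else:
    print("Blocked: prompt contains sensitive data; use the private deployment.")
```

The same principle drives the private vector store suggestion: if the data can’t be scrubbed, keep the whole retrieval pipeline on infrastructure you control.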

The output data is also a point of concern. The validity and accuracy of the output must always be double-checked, because it can simply be wrong. There are lots of prompt-engineering demos that show off the power of inference through these products, and an inference isn’t always the “correct” answer; it’s just an inference.
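A small, concrete example of that double-checking: if you ask the model for structured output, validate it mechanically before anything downstream trusts it. The expected keys below are a made-up example schema:

```python
import json
from typing import Optional

# Minimal sketch of an output-validation gate: never pass raw model output
# downstream without at least a mechanical check. The expected keys are a
# made-up example schema, not any real product's contract.
REQUIRED_KEYS = {"summary", "risk_level"}

def validate_llm_json(raw_output: str) -> Optional[dict]:
    """Parse model output as JSON and check it has the fields we expect."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        return None  # model produced prose instead of JSON
    if not REQUIRED_KEYS.issubset(data):
        return None  # valid JSON, wrong shape
    return data

print(validate_llm_json('{"summary": "ok", "risk_level": "low"}'))  # parsed dict
print(validate_llm_json("Sure! Here's your answer..."))             # None
```

Mechanical checks like this won’t catch a confidently wrong summary, of course; a human still has to own the final answer.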

All in all, it’s a fun time. Not many people know yet what the correct usage of such products looks like, and it’s one of the first times we’ve had social media to figure it out together. This is tech that shouldn’t be blocked based on initial fears. Instead, it should be understood and leveraged in a controlled setting.

At least that’s what Ultron would do. Right?
